Features

32-BIT FLOATING POINT PROCESSOR 
Single-precision floating point multiplier/ALU 
Four-port 32x32 register file 
IEEE floating point format 
Low power, high integration CMOS 


October 1987 


HIGH PERFORMANCE 

80, 100 and 120 ns cycle times 

Up to 25 MFLOPS throughput (1 MAC/cycle) 

Low latency (3-cycle register-to-register operation)
High I/O bandwidth (up to 200 Mbytes/sec) 
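
As a worked check of the throughput figure, the sketch below derives peak MFLOPS from the cycle time. It assumes that one multiply/accumulate per cycle counts as two floating point operations (one multiply and one add); the function name is illustrative and not part of any WEITEK tool.

```python
def peak_mflops(cycle_ns: float, flops_per_cycle: int = 2) -> float:
    """Peak MFLOPS for a given cycle time in nanoseconds.

    flops_per_cycle=2 is an assumption: one multiply plus one add
    for each multiply/accumulate issued.
    """
    cycles_per_second = 1e9 / cycle_ns
    return cycles_per_second * flops_per_cycle / 1e6


if __name__ == "__main__":
    for cycle_ns in (80, 100, 120):
        print(f"{cycle_ns} ns cycle: {peak_mflops(cycle_ns):.1f} MFLOPS")
```

Under this assumption the 80 ns part reaches the quoted 25 MFLOPS, and the slower grades scale in proportion to their cycle times.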


FULL FUNCTION 

Add, subtract, multiply, multiply/accumulate 
Divide look-up table 

Type conversion to and from two’s complement integer 
Three-address (rc := ra + rb) architecture 
Flexible I/O options 


MULTI-PURPOSE 

For maximum throughput, use the three-port 
WTL 3332 

For maximum design flexibility, use the WTL 3132 or 
WTL 3332 as microprogrammable building blocks 

For high level language support, use the XL-3132 as 
the XL-8032 floating point coprocessor 


Description 

The WTL 3132 and WTL 3332 are single-precision 
floating point data paths. Each includes a pipelined 
multiplier/accumulator and a four-port register file with 
thirty-two 32-bit registers. 

The WTL 3132/3332 are suited to a wide range of systems
that need high numeric processing performance.
They may be adopted as the floating point unit for a
general-purpose processor, used as building blocks for
application-specific data paths or even connected together
to create vector or array processors.

The WTL 3132 has a single bi-directional 32-bit input/output
port. It is designed to be used as a floating point
coprocessor or accelerator. The WTL 3332 has three
32-bit ports: one bi-directional input/output port, one
input port and one output port. It should be used in
applications that require multiple high-bandwidth
buses.

The XL-3132 may also be used with the WEITEK
XL-8136 program sequencing unit (PSU) and
XL-8137 integer processing unit (IPU) to create a fast,
general-purpose numeric processor, the XL-8032. Full
development system support, including FORTRAN and
C compilers, is available for the XL-Series of processors.
The XL-3132 is functionally identical to the WTL 3132.

Both devices are manufactured in low power CMOS
and are available in standard pin grid array (PGA)
packages. The WTL 3132 is supplied in a 144-pin PGA
and the WTL 3332 in a 168-pin PGA.


Figure 1. WTL 3132/3332 core functions
Architecture


MULTIPLIER/ACCUMULATOR 

The core of both the WTL 3132 and WTL 3332 is the
multiplier/accumulator pipeline. Its first stage can multiply
two operands together. The next stage can add or
subtract another operand. Finally, the result is rounded
and returned to a register and/or output port.

Multiply, add, subtract, and multiply/accumulate operations
are performed in the multiplier/accumulator.
They all operate on data that conforms to the IEEE
single-precision floating point format.

Each operation takes three cycles but, because the
multiplier/accumulator is pipelined, a new operation
can be started on every cycle. At any time, three independent
operations may be at different stages in their
execution.

Rounding, conversion between floating point and two’s 
complement integer formats, and other miscellaneous 
functions are supported in the accumulator. 

REGISTER FILE 

Operands and results of the multiplier/accumulator
may be stored in the four-port register file. This file
contains thirty-two registers, each of which may store a
32-bit value.

The four ports allow the register file to supply two operands
to the multiplier/accumulator, store its result back
to a register, and perform an input/output transfer, all
in the same cycle.

INPUT/OUTPUT PORTS 

The external I/O ports are all 32 bits wide. They can 
each transfer a data value on every cycle. 

The WTL 3132 has one bi-directional external port:
the X port. It can load and store data to and from the
register file, and it can transfer data directly to and
from the multiplier/accumulator.

The WTL 3332 has three external ports: the X port,
the Y port, and the Z port. The X port is the same as
the WTL 3132's X port. The Y port feeds input operands
directly to the multiplier/accumulator. The Z port
outputs results directly from the multiplier/accumulator.
These additional ports help to avoid the bottlenecks
usually associated with I/O-intensive algorithms.



Figure 2. WTL 3132/3332 I/O options 


TEMPORARY REGISTERS 

Three 32-bit temporary registers are provided to store
intermediate results. They make it possible to perform
operations of the form

x = x ± (y × z)

in a single cycle.

DIVIDE LOOK-UP TABLE 

Support for divide operations is provided by an on-chip
look-up table. It returns an approximation for the inverse
of a value, which may then be refined by iterative
multiply/accumulate operations. Division is accomplished
by multiplying the dividend by the inverse of
the divisor. This complete divide operation takes eighteen
cycles; other operations may be interleaved without
a performance penalty.
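
The refinement step can be sketched as follows. The text above says only that the seed is refined by iterative multiply/accumulate operations; the sketch assumes the usual Newton-Raphson recurrence x' = x × (2.0 - d × x), which fits a multiply, negate and add against the constant 2.0 followed by a multiply. The table contents, the seed precision (7 bits here) and the number of steps are assumptions, and Python floats do not model single-precision rounding.

```python
import math


def lut_seed(d: float, bits: int = 7) -> float:
    """Stand-in for the look-up table: a reciprocal good to about `bits` bits.

    Hypothetical model only; the real table contents are not described here.
    Zero, infinities and NaNs are not handled.
    """
    mantissa, exponent = math.frexp(d)      # d = mantissa * 2**exponent
    scale = 1 << bits
    approx = round((1.0 / mantissa) * scale) / scale
    return math.ldexp(approx, -exponent)


def reciprocal(d: float, steps: int = 2) -> float:
    """Refine the seed with Newton-Raphson steps (assumed recurrence)."""
    x = lut_seed(d)
    for _ in range(steps):
        # -(d * x) + 2.0 is one multiply, negate and add;
        # the outer product is one further multiply.
        x = x * (2.0 - d * x)
    return x


def divide(n: float, d: float) -> float:
    """Divide by multiplying the dividend by the refined inverse."""
    return n * reciprocal(d)


if __name__ == "__main__":
    print(divide(1.0, 3.0))
```

Each step roughly doubles the number of correct bits, so a seed of modest accuracy needs only a few iterations to reach single precision.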

INSTRUCTIONS 

An instruction is latched into the code port on every
cycle. It specifies operand sources, a result destination
and all of the steps that will create this result during the
next three cycles. Condition codes and exceptions may
be generated by each operation as the result is written
back to the register file.

Four five-bit fields provide addresses for the register
file. They each select a source or destination for one of
the register ports. The three-bit function field specifies
the type of multiplier/accumulator operation. The two-bit
I/O control field directs data transfer at the external
X port. Other fields select the route taken by the data
during the operation.
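
The field widths listed above can be captured in a small container that rejects out-of-range values. This is a sketch only: the class and attribute names are hypothetical, the bit positions within the code word are not given in this section and so no packing is attempted, and the remaining routing fields are omitted.

```python
from dataclasses import dataclass

# Widths taken from the text: four 5-bit register addresses,
# a 3-bit function field and a 2-bit I/O control field.
FIELD_WIDTHS = {"aadd": 5, "badd": 5, "cadd": 5, "dadd": 5, "func": 3, "ioct": 2}


@dataclass(frozen=True)
class Instruction:
    """Hypothetical holder for the main instruction fields."""
    aadd: int
    badd: int
    cadd: int
    dadd: int
    func: int
    ioct: int

    def __post_init__(self):
        for name, width in FIELD_WIDTHS.items():
            value = getattr(self, name)
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name}={value} does not fit in {width} bits")


if __name__ == "__main__":
    print(Instruction(aadd=0, badd=1, cadd=2, dadd=3, func=0b111, ioct=0))
```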

A Mode Register controls data routing options that
rarely change, selecting between a number of I/O timing
options, and supporting upward-compatibility with
previous versions of the WTL 3132/3332.

XL-SERIES COMPATIBILITY 

The XL-3132 may be used with the WEITEK XL-8136 
program sequencing unit (PSU) and XL-8137 integer 
processing unit (IPU) to create the XL-8032 processor. 

The XL-3132 floating point unit (FPU) shares a 64-bit 
instruction word with the IPU and PSU. The IPU and 
FPU also share the 32-bit-wide data bus. 

The XL-3132 responds to the NEUT-, STALL- and 
ABORT- signals used to control branching and “wait 
states” within the XL-8032. It communicates its status 
to the PSU with the floating point condition (FPCN) 
and exception (FPEX) lines. See Appendix A for more 
about the XL-Series. 



Figure 3. XL-8032 block diagram 
Signal Description

X PORT 

The 32-bit X31..0 port is a bi-directional data bus. Input
data is sampled on the rising edge of CLK (or, if Double-Pump
Mode is enabled, on both the rising and falling
edges of CLK). Data transfers are controlled by the
IOCt1..0 field in the instruction word. The X port may
be set to a high impedance state by the OEX- signal.
Active high.

Y PORT 

The 32-bit Y31..0 port is a data input bus. Input data is 
sampled on the rising edge of CLK (or, if the Y port 
Late Input Mode is enabled, on the falling edge of 
CLK). The Y port is only available on the WTL 3332. 
Active high. 

Z PORT 

The 32-bit Z31..0 port is a data output bus. The output
data is modified on every cycle. The Z port is only
available on the WTL 3332. It may be set to a high
impedance state by the OEZ- signal. Active high.

C PORT 

The 35-bit C34..0 port is used as a code input bus.
Instructions are latched on the rising edge of CLK. Active
high. C34 is only available on the WTL 3332 (see
figure 40).

OEX- 

X port output enable input. OEX- asynchronously disables
the X port when high. Active low.

OEZ-

Z port output enable input. OEZ- asynchronously disables
the Z port when high. Active low.

FPEX(-)

Floating point exception output. FPEX signals the occurrence
of an enabled exception (overflow). Polarity
is selectable by mode bit Ms.


FPCN 

Floating point condition output. FPCN signals the occurrence
of a condition as specified in the Encn1..0
field of an instruction. Active high.

ZERO

Zero condition output. Indicates that the result of an
operation is exactly equal to zero. Controlled by the
Encn1..0 field of an instruction. Active high.

NEUT- 

Neutralize input. Cancels the effect of the current instruction.
Typically used during delayed branches and
interrupt response routines (see page 27). Latched on
the cycle following the instruction to be cancelled.
Active low.

STALL- 

Stall input. Cancels the effect of the next instruction. 
Typically used as a “not ready” line from the code 
store (see page 28). Latched on the same cycle as the 
potentially invalid instruction. Active low. 

ABORT- 

Abort input. Cancels the effect of both the current and 
next instructions. Typically used as a "not ready" line 
from the data store (see page 29). Latched on the 
same cycle as the next instruction. Active low. 

CLK 

Clock input. TTL compatible. 

VDD 

All VDD pins must be connected to 5.0V. 

GND 

All GND pins must be connected to system ground. 
Note: Signals whose names end in "-" are active low.
Block Diagram

Figure 4. WTL 3132 block diagram

Figure 5. WTL 3332 block diagram

Register File

The WTL 3132 and WTL 3332 each have thirty-two
32-bit general-purpose registers. Each register can store
either a single-precision IEEE value or a two's complement
integer value.

PORTS 

The register file has four ports, A, B, C and D. The A 
and B ports are read-only, the C port is write-only, and 
the D port is bi-directional. Each port can transfer a 
32-bit data word on every clock cycle. 

The A and B ports may be used to supply operands to
the multiplier/accumulator and the divide look-up table.
The C port receives the result of a previous operation.
The D port communicates data between the register
file and the external X port.

This organization allows I/O transfers to proceed in
parallel with calculation, maximizing system performance.

REGISTER SELECTION 

The registers that are to take part in each transfer are
selected by the instruction word. The instruction format
allows a register address to be supplied for each
port. They are provided in the Aadd, Badd, Cadd and
Dadd fields of the instruction. These fields are five bits
in length, allowing each address to specify any of the
thirty-two registers.

An instruction supplies the Aadd, Badd, and Dadd addresses
to the register file during its first cycle and the
Cadd address during its fourth cycle. This way a single
instruction specifies all of the stages of an operation
from initial source to ultimate destination.

It is possible for a register to be selected by more than 
one field in the same cycle, in which case the following 
rules apply: 

1. If only read operations are to be performed on the 
register in question, then its value is copied to all of 
the necessary ports. 

2. If two ports (C and D) attempt to write into the
same register on the same cycle, the contents of the
register will be left in an undefined state. Such contention
should be avoided.

3. If a register is to be both read and written on the
same cycle, its old value will be read before it is updated
to the new value, unless one of the Bypass
Modes is activated (see page 16).
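
The three rules can be summarised in a small behavioural model. This is a sketch for reasoning about code sequences, not a description of the silicon: the class and method names are hypothetical, the Bypass Modes are not modelled, and an undefined register is simply recorded in a set.

```python
class RegisterFileModel:
    """Behavioural sketch of one cycle of the four-port register file."""

    def __init__(self):
        self.regs = [0.0] * 32
        self.undefined = set()   # registers left undefined by rule 2

    def cycle(self, a_addr, b_addr, c_write=None, d_write=None):
        """Perform one cycle; c_write and d_write are (address, value) or None.

        Returns the values seen on the A and B read ports.
        """
        # Rules 1 and 3: reads see the old contents, and the same
        # register may be copied to several ports.
        a_val, b_val = self.regs[a_addr], self.regs[b_addr]

        if (c_write is not None and d_write is not None
                and c_write[0] == d_write[0]):
            # Rule 2: C and D write the same register, result undefined.
            self.undefined.add(c_write[0])
        else:
            for write in (c_write, d_write):
                if write is not None:
                    addr, value = write
                    self.regs[addr] = value
                    self.undefined.discard(addr)
        return a_val, b_val


if __name__ == "__main__":
    rf = RegisterFileModel()
    rf.regs[2] = 1.5
    print(rf.cycle(2, 2, c_write=(2, 9.0)), rf.regs[2])
```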


PRELIMINARY DATA 




Figure 6. The four-port register file 


READ/WRITE CONTROL 

Two other fields in the instruction word affect the operation
of the register file.

1. The Cwen- bit controls writing of results into the C
port. When it is active (low), the result data is written
on the fourth cycle of the operation. When
writes are disabled, the contents of the register specified
by Cadd remain unchanged.

Register writes may be disabled either to direct a result
to a Temporary Register or to allow arithmetic
comparisons to modify the Status and Condition
Registers without overwriting the contents of a
general-purpose register.

2. The IOCt1..0 bits control the direction of D port
transfers (see page 20 for details). If the C port and
the D port attempt to write to the same register file
location on the same cycle, the register contents are
left undefined.
Multiplier/Accumulator

The WTL 3132 and WTL 3332 each have a pipelined
multiplier/accumulator. These consist of a floating
point multiplier whose output is fed into a floating point
ALU (Arithmetic and Logic Unit). All multiplier/accumulator
input and output ports can transfer 32-bit data
values. Figures 7 and 8 show how operations are
pipelined through the multiplier/accumulator.



Figure 7. Simple example of multiplier/accumulator timing. 
MULTIPLIER

The multiplier has two input ports, MAin and MBin. It
has one output which can only be connected to the
AAin port of the ALU. In the first cycle of an operation,
operands are transferred from the register file and
fed into MAin and MBin. The multiplication is completed
on the second cycle. The intermediate result
may be negated before it is passed to the ALU.

ALU 

The ALU has two input ports, AAin and ABin. It has
one output which is normally connected to the C bus.
AAin may be connected to the multiplier's output, so
that its result is fed into the ALU. Another operand is
fed into the ABin port simultaneously. The ALU completes
the function specified by an instruction during its
third cycle. The final result is rounded and output to
the C bus to be returned to the register file on the
fourth cycle.


LATENCY 

Because the multiplier/accumulator is pipelined, an operation
can be initiated every cycle. The result of an
operation is generated three cycles after it is initiated.
On the fourth cycle, the result can be returned to the
register file or fed straight back into the multiplier/accumulator
using the temporary registers or a bypass
mode (see pages 13 or 17).
FUNCTION SELECTION

The multiplier/accumulator function is specified by the
3-bit field F2..0 in the instruction word as outlined in
the function select table (figure 9). A single instruction
specifies all of the actions associated with one operation
as it passes through the multiplier/accumulator.


F2 F1 F0   MNEMONIC   OPERATION                       DESCRIPTION
0  0  0    -          Miscellaneous                   See figure 10
0  0  1    fsubr      Negate and add                  -AAin + ABin
0  1  0    fsub       Subtract                        AAin - ABin
0  1  1    fadd       Add                             AAin + ABin
1  0  0    -          Reserved
1  0  1    fmna       Multiply, negate and add        -(MAin × MBin) + ABin
1  1  0    fmns       Multiply, negate and subtract   -(MAin × MBin) - ABin
1  1  1    fmac       Multiply and accumulate         (MAin × MBin) + ABin

Figure 9. Function select field encoding
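
The encoding in figure 9 maps directly onto a lookup table. The sketch below is a software illustration only: the names `FUNCTION_SELECT` and `execute` are hypothetical, ordinary Python floats stand in for IEEE single precision, and rounding, exceptions and the miscellaneous group (F2..0 = 000) are not modelled. For the "ALU only" functions the first argument plays the role of AAin and the second is ignored.

```python
# Each entry: F2..0 -> (mnemonic, operation on (x, y, ab)).
# For fmac/fmns/fmna: x = MAin, y = MBin, ab = ABin.
# For fadd/fsub/fsubr: x = AAin, ab = ABin, y is unused.
FUNCTION_SELECT = {
    0b000: ("misc", None),       # decoded by the Badd field instead
    0b001: ("fsubr", lambda x, y, ab: -x + ab),
    0b010: ("fsub", lambda x, y, ab: x - ab),
    0b011: ("fadd", lambda x, y, ab: x + ab),
    0b100: ("reserved", None),
    0b101: ("fmna", lambda x, y, ab: -(x * y) + ab),
    0b110: ("fmns", lambda x, y, ab: -(x * y) - ab),
    0b111: ("fmac", lambda x, y, ab: (x * y) + ab),
}


def execute(f: int, x: float, y: float, ab: float) -> float:
    """Evaluate the arithmetic selected by the 3-bit function field."""
    mnemonic, operation = FUNCTION_SELECT[f]
    if operation is None:
        raise ValueError(f"F2..0={f:03b} ({mnemonic}) is not an arithmetic function")
    return operation(x, y, ab)


if __name__ == "__main__":
    print(execute(0b111, 2.0, 3.0, 4.0))
```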

When the F2..0 field is (0, 0, 0), the operation to be
performed is specified by the Badd field according to
figure 10.

Badd4-0       MNEMONIC   OPERATION               DESCRIPTION
00000         fclsr      Clear Status Register
00001         fstsr*     Read Status Register
00010         -          Reserved
00011         fmode      Load Mode Register
00100         fabs       Absolute Value          |AAin|
00101         float      Fixed-to-Float          integer -> IEEE
00110         fix        Float-to-Fixed          IEEE -> integer
00111         flut       Look-up Operation
01000-11111   -          Reserved

* fstsr instructions must have their IOCt1..0 field set to select a store.

Figure 10. Miscellaneous function select encoding.
MULTIPLY/ACCUMULATE FUNCTIONS

The WTL 3132/3332 provide three multiply and accumulate
functions. fmac multiplies MAin and MBin and
then adds ABin. fmns multiplies MAin and MBin, negates
the result and then subtracts ABin. fmna multiplies
MAin and MBin, negates the result and then adds
ABin.

These functions are triadic (they have three input operands).
If an IEEE multiply operation is required, the
constant 0.0 should be selected as the ABin input.

ALU FUNCTIONS

The WTL 3132/3332 provide three diadic "ALU only"
functions. fadd adds AAin to ABin. fsub subtracts ABin
from AAin. fsubr subtracts AAin from ABin.

The function code determines whether the multiplier is
bypassed and the ALU's input staged directly into the
ALU (see figure 11).

These functions operate in the same number of cycles
as the multiply and accumulate functions. This simplifies
the programmer's model: every operation has the
same latency.
MISCELLANEOUS FUNCTIONS

If the function field is equal to zero, then a miscellaneous
ALU function will be selected according to the
contents of the instruction's Badd field (see figure 10).

1. flut is monadic (that is, it has a single input operand).
It takes the value on the A bus as its operand
and it returns an approximation to the inverse of this
value onto the C bus on its fourth cycle. flut does not
attempt to modify the Status, Condition or Zero
Registers. It is recommended that the Abin2..0 field
be set to select the constant 0.0. (See page 38.)

2. fix is a monadic "ALU only" function. It takes a
single-precision IEEE format floating point value on
the A bus as its operand and returns a 24-bit, sign
extended, two's complement integer onto the C bus
on its fourth cycle. fix does not attempt to modify the
Status or Zero Registers. It is the only instruction
that produces an integer result. It is recommended
that the Abin2..0 field be set to select the constant
0.0. (See page 40.)

3. float is a monadic "ALU only" function. It takes a
24-bit, sign extended, two's complement value on
the A bus as its operand and returns a single-precision
IEEE format floating point number onto the C
bus on its fourth cycle. float does not attempt to
modify the Status or Zero Registers. It is the only
instruction that requires an integer operand. It is
recommended that the Abin2..0 field be set to select
the constant 0.0. (See page 40.)

4. fabs is a monadic "ALU only" function. It takes the
value on the A bus as its operand and returns its
absolute value onto the C bus on its fourth cycle.
fabs does not attempt to modify the Status Register.
The Zero and Condition Registers are modified according
to the result unless Encn1..0 = (0,0). As with
the diadic "ALU only" functions, it will clamp denormalized
operands to zero and NaNs to infinity.
The Abin2..0 field must be set to select the constant
0.0.


5. fmode loads the desired operating modes into the 
Mode Register (see page 35). Because this operation 
changes the timing of many operations, the results of 
the next three operations should be discarded. 

6. fstsr copies the contents of the Status Register to the
X port. It has the same timing as the other fstore
operations. (See page 26.) It is recommended that
the Cwen- bit be set to prevent register writes, the
Encn1..0 field to disable updates of the FPCN pin, and
the IOCt1..0 field to an fstore, and that the result sent to
the register file on its fourth cycle be discarded. fstsr
ignores the register address fields.

7. fclsr clears the contents of the Status Register to
zero. (See page 26.) It is recommended that the
Cwen- bit be set to prevent register writes, the
Encn1..0 field to disable updates of the FPCN pin, and
the IOCt1..0 field to an I/O nop, and that the result sent to the
register file on its fourth cycle be discarded. fclsr
ignores the register address fields.
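
The 24-bit, sign extended, two's complement format used by fix and float can be illustrated with two helpers. This is a sketch of the integer format only: the function names are hypothetical, and the rounding and overflow behaviour of the fix operation itself is not described in this section and is not modelled.

```python
def to_register_word(n: int) -> int:
    """Pack an integer in the 24-bit range into a 32-bit sign extended word."""
    if not -(1 << 23) <= n < (1 << 23):
        raise ValueError("value does not fit in 24 bits")
    return n & 0xFFFFFFFF        # upper 8 bits repeat the sign bit


def sign_extend_24(word: int) -> int:
    """Interpret the low 24 bits of a register word as two's complement."""
    value = word & 0xFFFFFF
    return value - (1 << 24) if value & 0x800000 else value


if __name__ == "__main__":
    word = to_register_word(-123456)
    print(hex(word), sign_extend_24(word))
```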

MULTIPLIER/ACCUMULATOR NOP 

The WTL 3132/3332 do not have a dedicated
nop instruction. WEITEK software tools use
fsub .f0, .f0, .f0 with the Cwen- bit set to disable register
writes and the Encn1..0 field cleared to disable
FPCN updates. This choice of nop causes no state
changes.
Temporary Registers

The WTL 3132 and WTL 3332 each include three
32-bit temporary registers (Tregs). They allow values
to be recirculated to the ALU without passing through
the general-purpose register file. The Tregs are often
used as accumulators during successive multiply/accumulate
operations. They make it possible to perform a
calculation of the form x = x ± (y × z) every cycle.

Figures 12 and 13 show how the Tregs are used to
feed back operands to the multiplier/accumulator;
figure 15 gives an example of a code sequence that does
this.




Note: For clarity, many key features have been omitted from this diagram (see pages 5 and 6 for more detail).


Figure 12. Use of temporary registers 
WRITING TEMPORARY REGISTERS

The instruction word contains a two-bit field, Adst1..0,
that determines the destination of the ALU output (see
figure 14).

The output of the ALU is always sent to the C bus. If
no Treg is selected by the Adst1..0 field, then the result
is returned only to the register selected by this instruction's
Cadd field on its fourth cycle.


Adst1-0   RESULT DESTINATION
00        Treg3, C bus
01        Treg2, C bus
10        Treg1, C bus
11        C bus


If the Adst1..0 field selects a Treg in addition to the C
bus, it is loaded with the result on the fourth cycle of
an operation, just as the Cadd register write occurs. On
the next cycle, the contents of the Treg may be input
directly to the ABin port. This is illustrated by the example
shown in figure 15.

The Cwen- bit of the instruction that writes to the Treg
may be held high to prevent the write to the Cadd register.
This increases the number of available general-purpose
registers.


Figure 14. ALU destination select field encoding 
READING TEMPORARY REGISTERS

The instruction word contains a three-bit field,
Abin2..0, that determines the source of the MAC's
ABin input.

Three encodings select one of the Tregs (see figure 17).
If a Treg is selected, it is copied to the ABin
port on the second cycle of the operation.

The Treg may be read on the cycle after it was written.
In figure 15, for example, .t1 gets read during the
second cycle of op #4. This is the fifth cycle of op #1.


op #1   fmac   .f0,  .f1,   0,    .t1
op #2   fmac   .f3,  .f4,   .t1,  .f5     -- old .t1
op #3   fmac   .f6,  .f7,   .t1,  .f8     -- illegal
op #4   fmac   .f9,  .f10,  .t1,  .f11    -- new .t1

Notes:

A Treg cannot be both written and read in the same cycle.
Op #3 must never attempt to read the Treg written by op #1.

To make this code interruptable, op #2 and op #3 should
not specify .t1 as an operand.

Full details of this syntax are given on page 32. 

Figure 15. Use of temporary registers 
Internal Data Routing


INTERNAL BUSES 

The three main internal buses, A, B and C, move data 
between the major functional units of the WTL 3132 
and WTL 3332. Each of these buses is 32 bits wide and 
can carry one word per cycle. 

The A bus usually carries operands from the register 
file to the multiplier/accumulator or the divide look-up 
table. It may also be fed with operands directly from 
the X port if the Input Bypass Mode is enabled (see 
page 21). 

The B bus usually carries operands from the register
file to the multiplier/accumulator. It may also be fed
directly with operands from the Y port on the WTL
3332 if selected by the Mbs- bit (see page 23).

The C bus usually carries multiplier/accumulator or flut
results back to the general-purpose register file. If the
Output Bypass Mode is enabled, then it may be used to
feed the results directly to the X port.

If the C bus is not needed to carry the results back to
the registers (e.g., when they are output directly to the
Z port), then its usual direction of transfer may be reversed.
It is then used to carry inputs from the X port
directly to the multiplier/accumulator. This is only possible
by the use of the floadrc operation or the Double-Pump
Mode (see page 21).


Mbin-   INPUT
0       B bus
1       C bus

Figure 16. Multiplier input port select field encoding


Abin2-0   INPUT
000       C bus
001       B bus
010       Treg2
011       Treg1
100       Treg3
101       Reserved
110       2.0
111       0.0

Figure 17. ALU input select field encoding
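
The selection in figure 17 can be expressed as a small multiplexer function. This is an illustrative sketch: the function name and argument layout are hypothetical, and the timing of when each source is sampled is not modelled.

```python
def abin_input(abin: int, b_bus: float, c_bus: float, tregs: dict) -> float:
    """Return the ABin operand selected by the 3-bit Abin2..0 field.

    `tregs` maps Treg number (1, 2 or 3) to its current contents.
    """
    if abin == 0b000:
        return c_bus
    if abin == 0b001:
        return b_bus
    if abin == 0b010:
        return tregs[2]
    if abin == 0b011:
        return tregs[1]
    if abin == 0b100:
        return tregs[3]
    if abin == 0b110:
        return 2.0               # constant source
    if abin == 0b111:
        return 0.0               # constant source
    raise ValueError(f"Abin2..0={abin:03b} is reserved")


if __name__ == "__main__":
    tregs = {1: 10.0, 2: 20.0, 3: 30.0}
    print(abin_input(0b011, b_bus=1.0, c_bus=2.0, tregs=tregs))
```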


MULTIPLIER/ACCUMULATOR INPUT PORTS 

The multiplier/accumulator has four input ports, MAin,
MBin, AAin and ABin. These ports can each receive a
32-bit word per cycle. They all have multiplexers which
may be connected to various input sources. The possible
selections for each port are described below:

1. MAin usually obtains input from the A bus. Results 
may be fed from the C bus to the A bus using the 
Internal Bypass Mode and then into MAin. 

2. MBin usually obtains its input from the B bus. Results 
may be fed from the C bus to the B bus using the 
Internal Bypass Mode and then into MBin. Y port 
inputs may be enabled onto the B bus and then into 
the MBin. 

Alternatively, the C bus can be reversed so that it is 
carrying inputs from the X port to the MBin port. 
This prevents the C bus from being used to return 
results to the register file and is done in conjunction 
with the floadrc operation or Double-Pump Mode. 
MBin must select the C bus (see figure 16). 

3. AAin usually obtains its input from the A bus. Results 
may be fed from the C bus to the A bus using the 
Internal Bypass Mode and then into AAin. 

4. ABin usually obtains its input from the B bus. Results 
may be fed from the C bus to the B bus using the 
Internal Bypass Mode and then into ABin. Y port 
inputs may be enabled onto the B bus and then into 
the ABin. 

The 3-bit Abin2..0 field in the instruction word selects
between input from the B bus (as above), input
of the constants 0.0 or 2.0, input from one of the
Tregs, or input from the C bus (as below) according
to figure 17.

Alternatively, the C bus can be reversed so that it is
carrying inputs from the X port to the ABin port.
This prevents the C bus from being used to return
results to the register file and is done in conjunction
with either the floadrc operation or Double-Pump
Mode. ABin must select the C bus. When this pathway
is used, the ABin data is not delayed: an external
register may be needed to synchronize the X and
Y inputs.
If the function code specifies an operation that uses the
multiplier, it directs data to the MAin or MBin ports. If
the multiplier is not used (that is, in "ALU only" operations),
then the data is sent to the AAin or ABin
ports automatically. The WTL 3132/3332 are designed
to maintain a consistent latency regardless of the type
of operation.

INTERNAL BYPASS MODE 



Note: For clarity, many key features have been omitted from this diagram (see pages 5 and 6 for more detail). 
Figure 18. Internal bypass routes 


A code sequence may often specify the result of one 
operation to be the operand of a subsequent operation. 
The WTL 3132/3332 provide several methods of 
achieving this: 

1. Internal Bypass Mode disabled 

The default mode (M0 = 0 and M11 = 0) of operation
is to send the result back to one of the general-purpose
registers and then read the new value of this
register as the operand.

Figure 19 shows this code sequence: op #1 specifies
.f2 as its destination and op #2-5 all specify .f2 as
one of their operands. Because the result of op #1
gets written back to .f2 on its fourth cycle, op #5 is
the first of the succeeding operations to read the new
contents of .f2 as desired. This represents a register-to-register
latency of four cycles.


op #1   fadd   .f0,  .f1,  .f2
op #2   fadd   .f2,  .f3,  .f4     -- old .f2
op #3   fadd   .f2,  .f5,  .f6     -- old .f2
op #4   fadd   .f2,  .f7,  .f8     -- old .f2
op #5   fadd   .f2,  .f9,  .f10    -- new .f2

Notes:

To make this code interruptable, op #2, op #3 and op #4
should not specify .f2 as an operand.

Full details of this syntax are given on page 32. 

Figure 19. No internal bypassing 

2. Internal Bypass Mode enabled 

If the Internal Bypass Mode is enabled (M0 = 1 and
M11 = 1) then the register-to-register latency is reduced
to just three cycles. Figure 18 shows the two
internal bypass multiplexers that allow results on the
C bus to be copied over to the A or B buses without
first being returned to the register file.

Figure 20 shows a code sequence that uses the bypass
mode: op #1 specifies .f2 as its destination and
op #2-4 all specify .f2 as one of their operands. In
contrast to the previous example, op #4 is fed the
result of op #1 on the same cycle that the result is
written to the register file.


The multiplexers operate by comparing the Aadd
and Badd address fields to the Cadd address field as
they are presented to the register file on each cycle.
If Aadd = Cadd, then the value on the C bus is copied
to the A bus; and if Badd = Cadd, then the value
on the C bus is copied to the B bus. Enough time
remains for that value to be latched into a multiplier/
accumulator input port before the end of the cycle.
In the example, the Cadd field of op #1 matches the
Aadd field of op #4 as they are compared on the
fourth cycle of op #1; the bypass from C to A buses
is opened and op #4 can proceed immediately with
the new data. During the same cycle, the result is
copied into the Cadd register as usual so that the
register file remains consistent with the data values in
use.

Setting mode bit M0 = 1 enables the C-to-A bus bypass
and setting mode bit M11 = 1 enables the C-to-B
bus bypass. If the Cwen- bit of an instruction is
set to prevent register writes, then the Internal Bypass
Mode is temporarily suspended on the fourth
cycle of that operation; this ensures that the register
file contents are kept in step with the operands used
by each instruction. Similarly, the NEUT-, STALL-,
and ABORT- signals cause the Internal Bypass Mode
to be suspended as they cancel the register write of
an instruction.
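
The address comparison performed by the bypass multiplexers can be sketched in software as below. The function and parameter names are hypothetical; `bypass_a` and `bypass_b` stand for the two mode bits that enable the C-to-A and C-to-B paths, and `write_enabled` is false whenever the register write is suppressed (by Cwen- or by NEUT-, STALL- or ABORT-).

```python
def operand_buses(aadd, badd, cadd, c_bus, regs,
                  bypass_a=True, bypass_b=True, write_enabled=True):
    """Return the (A bus, B bus) values for one cycle.

    aadd, badd: read addresses of the instruction starting this cycle.
    cadd, c_bus: write address and result of the instruction on its
    fourth cycle. regs: register contents before this cycle's write.
    """
    a_val, b_val = regs[aadd], regs[badd]
    if write_enabled:
        # Bypass is suspended when the register write is cancelled, so the
        # operands always agree with what the register file will hold.
        if bypass_a and aadd == cadd:
            a_val = c_bus
        if bypass_b and badd == cadd:
            b_val = c_bus
    return a_val, b_val


if __name__ == "__main__":
    regs = [0.0] * 32
    regs[2], regs[3] = 1.0, 5.0
    print(operand_buses(aadd=2, badd=3, cadd=2, c_bus=9.0, regs=regs))
```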


op #1   fadd   .f0,  .f1,  .f2
op #2   fadd   .f2,  .f3,  .f4     -- old .f2
op #3   fadd   .f2,  .f5,  .f6     -- old .f2
op #4   fadd   .f2,  .f7,  .f8     -- new .f2

Notes:

To make this code interruptable, op #2 and op #3 should not
specify .f2 as an operand.

Full details of this syntax are given on page 32.

Figure 20. Use of internal bypassing 
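
The difference between figures 19 and 20 comes down to one comparison, sketched below. The model assumes one operation is issued per cycle, that op #n reads its operands in cycle n, and that the producer (op #1) delivers its result on its fourth cycle; the function name is illustrative.

```python
WRITE_CYCLE = 4   # op #1's result reaches the C bus and register file here


def first_op_seeing_result(bypass: bool) -> int:
    """Number of the first op after op #1 that reads op #1's result."""
    op = 2
    while True:
        read_cycle = op   # op #n reads its operands in cycle n
        # Without bypass a read in the write cycle still returns the old
        # value; with bypass the C bus value is forwarded in that cycle.
        if read_cycle > WRITE_CYCLE or (bypass and read_cycle == WRITE_CYCLE):
            return op
        op += 1


def register_to_register_latency(bypass: bool) -> int:
    """Cycles between issuing op #1 and issuing the first dependent op."""
    return first_op_seeing_result(bypass) - 1


if __name__ == "__main__":
    for mode in (False, True):
        print(mode, first_op_seeing_result(mode), register_to_register_latency(mode))
```

With bypassing disabled the model returns op #5 and a latency of four cycles; with bypassing enabled it returns op #4 and three cycles, matching the two figures.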
3. Temporary Registers

To complete the range of alternative routes available
for feeding results back to the multiplier/accumulator,
the Treg code example first given in figure 15 is
repeated here.

The temporary register option has the same timing as
the Internal Bypass Mode, but only feeds back to
the ABin port. The Tregs bring an extra source of
operands to the multiplier/accumulator, allowing operations
of the form x = x ± (y × z) to be executed in
a single cycle.

Many of the code examples given read a register after
the instruction that modifies it has been initiated.
While such code sequences are valid, they are uninterruptable.
More detailed coverage of interruptable code
may be found on page 33.


op #1   fmac   .f0, .f1, 0,   .t1
op #2   fmac   .f2, .f3, .t1, .f4      --old .t1
op #3   fmac   .f5, .f6, .t1, .f7      --Illegal
op #4   fmac   .f8, .f9, .t1, .f10     --new .t1


Notes:

A Treg cannot be both written and read in the same cycle.
Op #3 must never attempt to read the Treg written by op #1.

To make this code interruptable, op #2 and op #3 should
not specify .t1 as an operand.

Full details of this syntax are given on page 32.

Figure 22. Use of temporary registers 
Input/output

The WTL 3132 and WTL 3332 provide different input/ 
output facilities. The WTL 3332 has additional Y and Z 
ports to increase the bandwidth between the multiplier/ 
accumulator and external components. Sections spe¬ 
cific to the WTL 3332 will state this in their headings. 


All data I/O ports are 32 bits wide. They can all
transfer at least one word on each cycle. The
memory-to-memory latency can be as low as five cycles
(two more than the register-to-register latency). All
output buses may be disabled by de-asserting their
asynchronous output enable signals.


THE X PORT: NORMAL USAGE 
(WTL 3132 AND WTL 3332) 



The X port normally transfers data to and from the
register file D port. The Dadd field selects the register
in question and the IOCt1..0 field in the instruction
word controls the transaction. These I/O transfers
always begin during the first cycle of an operation. See
figure 24 for the IOCt1..0 encoding scheme and
figure 23 for normal I/O timing.

1. The fload operation loads the value at the X port 
pins into the register selected by Dadd. It is com¬ 
pleted by the end of the first cycle. If the register is 
read on the same cycle, its previous contents will be 
output. 

2. The fstore operation stores the contents of the regis¬ 
ter selected by Dadd to the X port output register 
(XDoutR) during the first cycle. This value is driven 
onto the X port pins on the second cycle. The fstore 
operation drives the X pads during most of its second 
cycle and at the start of its third cycle. Input data 
may not be applied to the pins until partway through 
its third cycle. The OEX- pin can asynchronously 
disable the output at any time. 

3. The floadrc operation is described on page 22. 


4. The I/O nop operation simply disables the X port 
and ignores any input. It does not prevent the multi¬ 
plier/accumulator from writing to registers or modify¬ 
ing the state of the condition and exception outputs. 

Note: An fload should not follow an fstore immediately.
At least two I/O nop cycles must be inserted between
them if the Coprocessor Load Mode is disabled, and at
least one I/O nop if it is enabled (see page 27).


IOCt1..0    OPERATION

00          I/O nop
01          floadrc
10          fstore
11          fload


Figure 24. X port I/O control field encoding 
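
As an illustration of the encoding in figure 24, the following sketch decodes the two IOCt bits into the operation name. The table values come from the figure; the dictionary and function names are hypothetical and exist only for this example.

```python
# Figure 24 encoding, keyed by the two-bit value (IOCt1, IOCt0).
IOCT_OPERATIONS = {
    0b00: "I/O nop",
    0b01: "floadrc",
    0b10: "fstore",
    0b11: "fload",
}


def decode_ioct(ioct1, ioct0):
    """Return the X port operation selected by the IOCt1..0 field."""
    if ioct1 not in (0, 1) or ioct0 not in (0, 1):
        raise ValueError("IOCt bits must be 0 or 1")
    return IOCT_OPERATIONS[(ioct1 << 1) | ioct0]


if __name__ == "__main__":
    print(decode_ioct(1, 0))
```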

Note: The C bus can be used either for X port inputs or for results, but not both.


Figure 25. Input/Output Bypass diagram (X port) 


INPUT BYPASS MODE 
(WTL 3132 AND WTL 3332) 

During a normal fload operation the X port data must
be written to a register one cycle before it can be used
as the operand of a subsequent operation. If, however,
the Input Bypass Mode is enabled and Aadd = Dadd,
the value at the X port is loaded into both the register
selected by Dadd and onto the A bus during the first
cycle.

The Input Bypass Mode is enabled when bit M3 of the
Mode Register is set to 1 (see page 35).

[Timing diagram: cycles 1 to 5, showing the MULTIPLY, ALU and RESULT pipeline stages]



Figure 26. X Port I/O timing (Input and Output Bypass Modes enabled; M3 = 1 and M4 = 1) 


The MAin port always receives the input data by the 
end of the first cycle. If the function is “ALU only”, 
the data is staged into the AAin port during the second 
cycle. 

The Input Bypass Mode should not be enabled when 
the Coprocessor Load Mode is enabled. 


FLOADRC OPERATION
(WTL 3132 AND WTL 3332)

During a normal fload operation the X port data is
copied to the D port, and, if the Input Bypass Mode is
activated, the A bus. If the floadrc operation is used
instead, the value at the X port is loaded both into the
register selected by Dadd and onto the C bus during
the first cycle, where it can be selected by one of the
multiplier/accumulator input ports, MBin or ABin.

floadrc prevents the multiplier/accumulator from
writing the result of a previous instruction to the Cadd
register, and the resulting contents of this register are
undefined. Results from the divide look-up table preempt
the floadrc operation and are written to the Cadd
register as usual; floadrc still copies the X port value to the
Dadd register. The Cwen- control can be set to prevent
unwanted writes to the register specified by the
Cadd field of an instruction.

If the MBin multiplexer selects the C bus it will receive
the X port data by the end of the first cycle. If the ABin
multiplexer selects the C bus it will receive the X port
data by the end of the first cycle. Note that the two
ports are skewed by one cycle from the programmer's
view because the ABin input is not delayed.

floadrc should not be used in an interruptable environment.


OUTPUT BYPASS MODE 
(WTL 3132 AND WTL 3332) 

During a normal fstore operation the multiplier/accumulator
result must be written to a register one cycle
before it can be output to the X port. If, however, the
Output Bypass Mode is enabled and Cadd = Dadd, the
multiplier/accumulator result is sent to both the register
selected by Cadd and the X port output register
(XCoutR) on the same cycle.

The Output Bypass Mode is enabled when bit M4 of 
the Mode Register is set (see page 35). 

The fstore instruction that specifies the Dadd must 
start execution on the fourth cycle of the arithmetic 
instruction that specified the Cadd. The output appears 
at the X port pins during the second cycle of the fstore 
instruction. (See figure 26.) 

If the Cwen- bit of an instruction is set to prevent
register writes, then the Output Bypass Mode is
temporarily suspended on the fourth cycle of that operation.
This ensures that the register file contents are kept in
step with the output data.

THE Y PORT: NORMAL USAGE 
(WTL 3332 ONLY) 

The Y port is an input-only port. The Mbs- bit in the 
instruction word controls whether the B register file 
port or the Y input port drives the B bus (see fig¬ 
ure 28). If the B port is selected, then the data input at 
the Y port is discarded. 

If the MBin port then selects the B bus as input, the 
multiplier will receive the Y port data by the end of the 
first cycle. If the ABin port selects the B bus as input, 
the ALU will receive the Y port data at the end of the 
next cycle. 

Two external data streams may be fed into the multi¬ 
plier/accumulator through the X and Y ports without 
being written to the register file. The Y port data can be 
fed into the MBin or ABin inputs via the B bus. Mean¬ 
while, the X port data can be fed to the MAin or AAin 
inputs via the A bus using the Input Bypass Mode. Al¬ 
ternatively, the X port data may be fed into the C bus 
with the floadrc operation. 



Mbs-    B BUS INPUT SOURCE

0       Register file (B Port)
1       External input (Y Port)


Figure 28. Y port input select field encoding 


Y LATE INPUT MODE (WTL 3332 ONLY) 

The Y port data may be sampled on the falling edge of 
CLK instead of its rising edge. This mode is selected by 
setting bit M12 of the Mode Register to 1 (see page 
35). It reduces the delay through the YinR register so 
that the data still arrives at the multiplier/accumulator 
ports before the end of the first cycle. 

The Y port Late Input Mode can be used to demul¬ 
tiplex a high-bandwidth input bus onto the X and Y 
ports. 



THE Z PORT: NORMAL USAGE 
(WTL 3332 ONLY) 

The Z port is an output-only port. It outputs the result
of a multiply/accumulate operation directly (see
figure 30). The result of a flut operation cannot be output
to the Z port.

The Z port output register (ZoutR) is loaded on the 
fourth cycle of an operation; just when the Cadd regis¬ 
ter should be written. This transfer occurs without re¬ 
gard to any other activity on the C bus. 

The Z port output register then drives this value onto 
the Z port pins. The output occurs during the fifth cy¬ 
cle of the instruction. The Z port output register is up¬ 
dated on every cycle, even if the multiplier/accumula¬ 
tor result is meaningless. The OEZ- pin can 
asynchronously float the output at any time. 
DOUBLE-PUMP MODE
(WTL 3332 ONLY)

The WTL 3332 multiplier/accumulator has four ports. 
It can be fed an operand through each of the MAin, 
MBin and ABin inputs and returns a result via the ALU 
output on every cycle. Only three external ports are 
provided, X, Y and Z. Double-Pump Mode allows two 
inputs to be made via the X port on each cycle; all four 
multiplier/accumulator ports may then be serviced by 
the external ports. 

Double-Pump Mode is enabled by setting bit M9 of the
Mode Register to 1 (see page 35). If Double-Pump
Mode is enabled, the IOCt1..0 field in the instruction
word must constantly be set to fload.

The first input is latched into the X port on the rising
edge of CLK at the beginning of the cycle, as usual. It is
written into the Dadd register by the end of the cycle.
If the Input Bypass Mode (see above) is enabled and
Aadd = Dadd, then this value is also driven onto the A
bus and can be latched by the MAin port by the end of
the first cycle or the AAin port by the end of the
second cycle.

The second input is latched into the X port on the next 
falling edge of CLK. It is driven onto the C bus and 
may be latched by the MBin port or the ABin port by 
the end of the first cycle. 


The MBin or ABin port may also receive input data 
from the Y port via the B bus allowing all three multi¬ 
plier/accumulator inputs to be fed simultaneously. 

The multiplier/accumulator result should be directed to 
the Z port. 
System Interfacing

Certain signals on the WTL 3132 and WTL 3332 are 
provided to communicate control information to and 
from the other parts of a system. 

Two outputs, FPCN and ZERO, indicate the condition 
of an operation. They can be sent to a sequencer to 
control instruction branching. 

One output, FPEX, signals the occurrence of arithmetic 
overflow. It can be used to interrupt a host processor to 
request corrective action. 

Three inputs, NEUT-, STALL- and ABORT-, allow the 
effects of instructions fed into the C port to be can¬ 
celed. They can be used to make the WTL 3132/3332 
respond correctly to page faults, interrupts or other sys¬ 
tem requests. 

CONDITION AND ZERO 

The WTL 3132/3332 have a Condition Register and a 
Zero Register. The multiplier/accumulator attempts to 
modify the contents of these registers on every cycle. 

The instruction word includes a two-bit condition select
field, Encn1..0, which selectively allows the multiplier/
accumulator to succeed in updating the contents of
these registers. If both bits are cleared, then the
previous state of the registers remains unchanged.

Most functions update the Condition Register according
to the sign and magnitude of their result. Miscellaneous
functions may set the register for other reasons
(see figure 32).

Encn1..0 determines the exact condition that will set the
Condition Register for each instruction. This allows any
of the common comparisons (>, ≥, =, ≤, <) to be
made in one operation. Figure 33 gives the bit encoding.

If the result of an operation is exactly equal to zero and
the Encn1..0 field is not (0,0), the Zero Register is set
to 1. If the result is not zero and Encn1..0 is not (0,0),
the Zero Register is cleared to 0. If Encn1..0 is (0,0),
the contents of the Zero Register remain the same.

The contents of the Zero and Condition Registers are 
copied to the ZERO and FPCN outputs respectively on 
the fourth cycle of the operation, just as the general- 
purpose register file write occurs. Bypassing does not 
affect the timing of these signals. These outputs always 
drive a logic 0 or 1. 


FUNCTION   SET CONDITION REGISTER                 SET ZERO REGISTER   SET STATUS REGISTER

fmac       N<0, N≤0, N = 0                        N = 0               exp (N) > 255
fmns       N<0, N≤0, N = 0                        N = 0               exp (N) > 255
fmna       N<0, N≤0, N = 0                        N = 0               exp (N) > 255
fadd       N<0, N≤0, N = 0                        N = 0               exp (N) > 255
fsub       N<0, N≤0, N = 0                        N = 0               exp (N) > 255
fsubr      N<0, N≤0, N = 0                        N = 0               exp (N) > 255
flut       -                                      -                   -
fabs       -                                      -                   -
fix        |M| > 2^?? (exponent illegible)        -                   -
float      M < -(2^??) or M > (2^?? - 1)          -                   -
           (exponents illegible)
fnop       -                                      -                   -

N is the result of an operation; M is the operand.

Note: fix and float only test for operand range excess when M1 = 1.



Figure 32. Effect of functions on Condition, Zero, and Status Register 




© Copyright WEITEK 1987 
All Rights Reserved 













Encn1   Encn0   SET ZERO REGISTER   SET CONDITION REGISTER

0       0       -                   -
0       1       N = 0               N < 0
1       0       N = 0               N ≤ 0
1       1       N = 0               N = 0

Note: Encn1 and Encn0 must be zero for fnop.

Condition Register is set by fix or float when Encn1..0 is (0,1) and the operand exceeds the permitted range.


Figure 33. Encn1..0 encoding
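
The update rules for the Condition and Zero Registers can be summarised as a small function. This is a sketch for the arithmetic functions only (fmac, fadd and so on), not for fix, float or the miscellaneous functions. The function name is hypothetical, and the N ≤ 0 test for Encn1..0 = (1,0) is an assumption: the comparison signs in the source figure are partly illegible.

```python
def update_flags(encn1, encn0, result, condition, zero):
    """Return the new (condition, zero) register contents.

    condition and zero are the current register contents (0 or 1);
    result is the value N produced by an arithmetic function.
    """
    if (encn1, encn0) == (0, 0):
        # Both registers keep their previous state.
        return condition, zero
    tests = {
        (0, 1): result < 0,
        (1, 0): result <= 0,   # assumed; see lead-in
        (1, 1): result == 0,
    }
    return int(tests[(encn1, encn0)]), int(result == 0)


if __name__ == "__main__":
    print(update_flags(0, 1, -2.5, condition=0, zero=1))
```

A sequencer would then branch on the FPCN and ZERO outputs, which carry these two values from the fourth cycle of the operation.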


STATUS AND EXCEPTIONS 

The WTL 3132/3332 have a Status Register. If an op¬ 
eration produces a result that is too large to be repre¬ 
sented in the IEEE single-precision floating point for¬ 
mat the multiplier/accumulator attempts to set the 
Status Register to 1. 

The Mode Register includes an exception control bit, 
M5 (see page 35). If M5 is set to 1 and an overflow 
occurs, the Status Register is set to 1. If M5 is cleared 
to 0, the Status Register is cleared to 0. If M5 is subse¬ 
quently set to re-enable overflows, the Status Register 
will contain 0. 

The contents of the Status Register are copied to the 
FPEX output on the fourth cycle of the operation, just 
as the register file write occurs. 

One bit in the Mode Register, M8, selects the polarity 
of FPEX. If it is set to 1, then FPEX is active high. If it 
is cleared to 0, then FPEX is active low. If M8 is 
cleared, the Status Register is “sticky”; once set it will 
remain so until an fcisr operation is performed. 

Two miscellaneous functions, fstsr and fcisr, allow the
Status Register to be read at the X port or for it to be
cleared. They take effect during their first cycle. If the
fstsr operation is performed, then the IOCt1..0 field
must specify a ‘store’ and its timing is the same as a
normal store operation (see figure 34). Only the
least-significant bit of the Status Register is guaranteed; the
other 31 bits should be masked off when it is read. If



Figure 34. fstsr timing 


[Timing diagram signals: CLK, CODE PORT, FPCN/ZERO, FPEX]


Figure 34a. Condition/Status timing



COPROCESSOR LOAD MODE 


The Coprocessor Load Mode is provided to support 
systems which generate a data address at the beginning 
of a cycle and need to latch the data word into the 
WTL 3132/3332 later in the same cycle. 

If bit M6 of the Mode Register is set, then the data 
applied to the X port is not sampled until late in the 
first cycle. Time still remains to write the Dadd register 
before the end of this cycle. As usual, the next instruc¬ 
tion can use this data value as one of its operands. 


The number of I/O nops that must be inserted between 
an fstore and a subsequent fload is reduced from two 
to one. 

The STALL- signal timing is modified internally if the 
Coprocessor Load Mode is enabled so that it still can¬ 
cels the fload and fstore operations. 

This mode must be selected if the WTL 3132 is to be 
used in the XL environment (see Appendix A). 


If this mode is used, neither the Double-pump Mode 
nor the X port Input Bypass Modes may be enabled. 



NEUT-, STALL- AND ABORT- 


These three inputs allow a system to modify the effect 
of certain instructions dynamically. 

1. NEUT- 

Neutralize is used to prevent the execution of in¬ 
structions in the shadow of a delayed branch opera¬ 
tion or during an interrupt service cycle. 

In a system where the sequencer supports delayed
branching, it will present the next instruction to the
C port as it decides whether to take a branch. If the
branch is taken, this instruction must be cancelled 
before it has any effect on the state of the system. 
Similarly, if an interrupt occurs, the instruction due 
to be executed can be cancelled in order to branch 
to an interrupt service routine. The cancelled in¬ 
struction is resubmitted for execution on return from 
interrupt. 



Figure 36. NEUT- timing 


The neutralize signal cancels the effect of the cur¬ 
rent instruction. It prevents the result of this instruc¬ 
tion from being written into the register file or tem¬ 
porary registers. It has no effect on fload or fstore 
operations. This signal is sampled on the rising edge 
of CLK after the current instruction was fed into the 
C port. 

2. STALL- 

STALL- is used to hold off execution until a valid
code word is present when the code word is delayed
(as in a code memory refresh cycle) or absent (as in
a page fault). The next operation can be continually
stalled until the correct instruction word is presented
to the C port.

The STALL- signal cancels the effect of the next in¬ 
struction. It prevents the result of this instruction 
from being written into the register file or temporary 
registers. It also cancels fload and fstore operations. 
This signal is sampled at the same time as the next 
instruction is fed into the C port. 

If Coprocessor Load Mode is enabled, the timing of 
STALL- is modified internally to maintain its usual 
effect. The fmode and fcisr instructions are also 
properly stalled. 


[Timing diagram signals: CLK, CODE PORT, X PADS, STALL-, REGISTER FILE]



Figure 37. STALL- timing (including fload) 



Figure 38. STALL- timing (including fstore) 

3. ABORT- 

ABORT- is used to cancel the current and next in¬ 
structions when the data word is delayed (as in a 
cache miss) or absent (as in a page fault). If the two 
cancelled instructions are subsequently resubmitted 
for execution the processor will behave as if no inter¬ 
rupt occurred. 

The ABORT- signal cancels the effect of both the 
current and next instructions. It prevents the results 
of these instructions from being written into the reg¬ 
ister file or temporary registers. It also cancels the 
I/O operation specified in the next instruction, but 
not in the current instruction. This signal is sampled 
at the same time as the next instruction is fed into 
the C port. 



All three of these signals are delayed internally when 
necessary: for example, the Cadd register write is only 
disabled on the instruction’s fourth cycle. They prevent 
the cancelled instructions from modifying the Condi¬ 
tion or Status Registers, thus preventing the state of the 
FPCN, FPEX and ZERO outputs from changing. 
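
The scope of the three cancel inputs, as described above, can be tabulated as follows. This is a descriptive sketch only; the table and function names are hypothetical, and "current" and "next" refer to the instruction already fed into the C port and the one being fed in when the signal is sampled.

```python
# For each signal: which instructions lose their register/Treg write
# ("results") and which lose their fload/fstore transfer ("io").
CANCEL_EFFECTS = {
    "NEUT-":  {"results": ("current",),        "io": ()},
    "STALL-": {"results": ("next",),           "io": ("next",)},
    "ABORT-": {"results": ("current", "next"), "io": ("next",)},
}


def is_cancelled(signal, slot, what):
    """True if `what` ('results' or 'io') of the instruction in `slot`
    ('current' or 'next') is cancelled when `signal` is asserted."""
    return slot in CANCEL_EFFECTS[signal][what]


if __name__ == "__main__":
    for name in CANCEL_EFFECTS:
        print(name, is_cancelled(name, "current", "io"))
```

None of the three signals cancels the I/O transfer of the current instruction, which is why an instruction that may be neutralized in a delayed-branch shadow should not perform I/O.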
C BIT   FIELD    OPERATION

0       Encn0    Condition Output Select
1       Encn1
2       Mbin-    MBin Input Select
3       Adst0    ALU Destination Select
4       Adst1
5       Abin0    ABin Input Select
6       Abin1
7       Abin2
8       Dadd0    D Port Register Address
9       Dadd1
10      Dadd2
11      Dadd3
12      Dadd4
13      IOCt0    I/O Control
14      IOCt1
15      Cwen-    C Port Write Enable
16      Cadd0    C Port Register Address
17      Cadd1
18      Cadd2
19      Cadd3
20      Cadd4
21      Badd0    B Port Register Address
22      Badd1
23      Badd2
24      Badd3
25      Badd4
26      Aadd0    A Port Register Address
27      Aadd1
28      Aadd2
29      Aadd3
30      Aadd4
31      F0       Function Code
32      F1
33      F2
34      Mbs-     Y Port Input Select*

*Only on WTL 3332


Figure 40. Instruction format 
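
The bit positions in figure 40 translate directly into a pack/unpack helper, sketched below. The field positions and widths are taken from the figure; the helper names are hypothetical, and no meaning is assigned to the field values (for example the function codes), which are defined elsewhere in this data sheet.

```python
# Field name -> (least-significant C bit, width in bits), from figure 40.
FIELDS = {
    "Encn": (0, 2), "Mbin": (2, 1), "Adst": (3, 2), "Abin": (5, 3),
    "Dadd": (8, 5), "IOCt": (13, 2), "Cwen": (15, 1), "Cadd": (16, 5),
    "Badd": (21, 5), "Aadd": (26, 5), "F": (31, 3),
    "Mbs": (34, 1),  # WTL 3332 only
}


def pack(**values):
    """Assemble an instruction word from named field values."""
    word = 0
    for name, value in values.items():
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << lsb
    return word


def unpack(word):
    """Split an instruction word into its named fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


if __name__ == "__main__":
    word = pack(Aadd=1, Badd=2, Cadd=3, IOCt=0b11, Dadd=4)
    print(f"{word:035b}")
    print(unpack(word))
```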
Instruction Set, continued

FORMAT 

The WTL 3132 has a 34-bit instruction word. The 
WTL 3332 has a 35-bit instruction word because it re¬ 
quires the extra Mbs- bit to control the Y port input. 
Refer to figure 40 for the location of each field in the 
instruction word. 

1. Mbs- bit. (WTL 3332 only)

B bus input control bit. May be register file port B or
external input port Y.

2. F field.

Function control (F2..0) field. Selects function to be
performed on this instruction’s operands.

3. Aadd field.

A register address (Aadd4..0) field. Selects location
in general-purpose register file to be read out via the
A port.

4. Badd field.

B register address (Badd4..0) field. Selects location
in general-purpose register file to be read out via the
B port. Also encodes miscellaneous functions that
require only one operand.

5. Cadd field.

C register address (Cadd4..0) field. Selects location
in general-purpose register file to be written into via
the C port.

6. Cwen- bit.

C port write enable bit. Active low.

7. IOCt field.

I/O control (IOCt1..0) field. Selects type of I/O
transfer performed via D port.

8. Dadd field.

D register address (Dadd4..0) field. Selects location
in general-purpose register file to be read out or
written into via the D port.

9. Abin field.

ALU input multiplexer input control (Abin2..0) field.
Selects the input source for ALU.

10. Adst field.

ALU output destination control (Adst1..0) field.
Selects output destination for ALU. May be the C bus
or the C bus and a Temporary Register.

11. Mbin- bit.

MBin port input control bit. Selects multiplier input
source. May be the B bus or C bus.

12. Encn field.

Condition select (Encn1..0) field. Enables a selectable
combination of the condition and zero flags
onto the FPCN output.

All of the actions specified by these fields are defined 
in the same instruction word. In this way, all of the 
stages of an operation, from supplying its operands to 
storing its results back into a register, are specified to¬ 
gether. 


MNEMONICS 


The mnemonics shown in figures 41 through 43 are 
those used to control the XL-3132 in the XL-Series 
programming environment. They are given here to sim¬ 
plify understanding of the programming model and to 
provide a syntax in which to present the programming 
examples. 

These mnemonics represent a subset of the functions 
available on the WTL 3132/3332. In particular, the In¬ 
put Bypass Mode is disabled and the Output Bypass, 
Internal Bypass, and Coprocessor Load Modes are en¬ 
abled unless otherwise noted. The user is free to en¬ 
hance or disregard this suggested programming model 
according to system requirements. 


I/O SELECTION   OPERAND
                Source     Destination

fload           X port     .f0-31
fstore          .f0-31     X port

Note: If no I/O operation is specified, an I/O nop will be
selected in the IOCt1..0 instruction field.


Figure 41. Recommended mnemonics 


OPERAND NOTATION   DESCRIPTION

.f0-31             Thirty-two, 32-bit general purpose registers
0                  Constant "0.0"
2                  Constant "2.0"
.t1-3              Three temporary registers


Figure 42. Recommended mnemonics 


FUNCTION   OPERAND SELECTIONS
           SOURCE (Aadd)   SOURCE (Badd)            SOURCE (Tregs)   DESTINATION (Cadd)

fmac       .f0-31          .f0-31                   0, 2, or .t1-3   .f0-31 and/or .t1-3
fmns       .f0-31          .f0-31                   0, 2, or .t1-3   .f0-31 and/or .t1-3
fmna       .f0-31          .f0-31                   0, 2, or .t1-3   .f0-31 and/or .t1-3
fadd       .f0-31          .f0-31, 0, 2, or .t1-3                    .f0-31 and/or .t1-3
fsub       .f0-31          .f0-31, 0, 2, or .t1-3                    .f0-31 and/or .t1-3
fsubr      .f0-31          .f0-31, 0, 2, or .t1-3                    .f0-31 and/or .t1-3
flut       .f0-31                                                    .f0-31
fabs       .f0-31                                                    .f0-31 and/or .t1-3
fix        .f0-31                                                    .f0-31 and/or .t1-3
float      .f0-31                                                    .f0-31 and/or .t1-3
fnop       -

Figure 43. Recommended mnemonics 

CODE CONSTRAINTS

The following set of rules prevents illegal code 
sequences: 

1. All instructions must avoid writing to the Cadd regis¬ 
ter and Dadd register simultaneously. Thus no fload 
operation with Dadd = .fx may start on the fourth 
cycle of an operation with Cadd = .fx. 


op #1   fadd .f?, .f?, .fx
op #2   fadd .f?, .f?, .f?;   fload .fx
op #3   fadd .f?, .f?, .f?;   fload .fx
op #4   fadd .f?, .f?, .f?;   fload .fx     --Illegal
op #5   fadd .f?, .f?, .f?;   fload .fx


Figure 44. 


2. Because the X port output is driven on the cycle
after an fstore operation is specified, an fload cannot
follow an fstore immediately. At least one I/O
nop must intervene if the Coprocessor Load Mode is
enabled (M6 = 1), or two I/O nops if it is disabled
(M6 = 0) (see page 27).


op #1   fadd .f?, .f?, .f?;   fstore .f?
op #2   fadd .f?, .f?, .f?;   fload .f?     --Illegal
op #3   fadd .f?, .f?, .f?;   fload .f?


Figure 45. Coprocessor Load Mode enabled, M6=1


op #1   fadd .f?, .f?, .f?;   fstore .f?
op #2   fadd .f?, .f?, .f?;   fload .f?     --Illegal
op #3   fadd .f?, .f?, .f?;   fload .f?     --Illegal
op #4   fadd .f?, .f?, .f?;   fload .f?

Figure 46. Coprocessor Load Mode disabled, M6=0 

3. No temporary register can be written and read on 
the same cycle. Thus no operation that selects .tx as 
an operand register may start on the third cycle of 
an operation with Cadd = .tx. 

op #1   fmac .f?, .f?, 0,   .tx
op #2   fmac .f?, .f?, .tx, .f?
op #3   fmac .f?, .f?, .tx, .f?     --Illegal
op #4   fmac .f?, .f?, .tx, .f?


Figure 47. Coprocessor Load Mode disabled, M6=0 
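
Rules 1 to 3 lend themselves to a mechanical check. The following is a minimal sketch of such a checker, assuming one instruction is issued per cycle with no stalls, so that "the fourth cycle of op i" is the cycle in which op i+3 starts. The Op record and function name are hypothetical and model only the fields the three rules refer to.

```python
from dataclasses import dataclass


@dataclass
class Op:
    dest: str = ""        # register written by the result, e.g. ".f3" or ".t1"
    sources: tuple = ()   # operand registers read
    io: str = "nop"       # "nop", "fload" or "fstore"
    dadd: str = ""        # register named by the I/O operation


def check(program, coprocessor_load):
    """Return a sorted list of (op index, rule number) violations."""
    errors = set()
    nops_needed = 1 if coprocessor_load else 2
    for i, op in enumerate(program):
        # Rule 1: no fload into .fx starting on the fourth cycle of an
        # operation whose Cadd is .fx.
        if op.dest.startswith(".f") and i + 3 < len(program):
            later = program[i + 3]
            if later.io == "fload" and later.dadd == op.dest:
                errors.add((i + 3, 1))
        # Rule 2: I/O nops are required between an fstore and an fload.
        if op.io == "fstore":
            for j in range(i + 1, min(i + 1 + nops_needed, len(program))):
                if program[j].io == "fload":
                    errors.add((j, 2))
        # Rule 3: a Treg cannot be read on the third cycle of the
        # operation that writes it.
        if op.dest.startswith(".t") and i + 2 < len(program):
            if op.dest in program[i + 2].sources:
                errors.add((i + 2, 3))
    return sorted(errors)


if __name__ == "__main__":
    fig47 = [Op(dest=".tx")] + [Op(dest=f".f{n}", sources=(".tx",))
                                for n in (1, 2, 3)]
    print(check(fig47, coprocessor_load=False))
```

Indices are zero-based, so a violation reported at index 2 corresponds to op #3 in the figures.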


If code is to be interruptable and respond correctly to 
the NEUT-, STALL- and ABORT- signals, then these 
additional rules must also be followed. They all prevent 
delayed register writes from modifying operand values 
in a time-dependent fashion. 

4. No operation with Aadd or Badd = .fx may start
after the first cycle and before the fourth cycle of an
operation with Cadd = .fx. (If the Internal Bypass
Mode is disabled (M0 = 0 and M11 = 0), this becomes
the fifth cycle.)



Figure 48. Internal Bypass Mode enabled,
M0 = 1 and M11 = 1



Figure 49. Internal Bypass Mode disabled,
M0 = 0 and M11 = 0


5. No operation that selects .tx as an operand register 
may start after the first cycle and before the fourth 
cycle of an operation with Cadd = .tx. 


op #1   fmac .f?, .f?, 0,   .tx
op #2   fmac .f?, .f?, .tx, .f?     --Illegal
op #3   fmac .f?, .f?, .tx, .f?     --Illegal
op #4   fmac .f?, .f?, .tx, .f?



Figure 50. 


6. No operation with Aadd or Badd = .fx may start on 
the same cycle as an fload where Dadd = .fx. 



7. The NEUT- line does not cancel fload and fstore, so 
when it is used to cancel the effect of an instruction 
in the shadow of a delayed branch operation (as in 
the XL-Series), this instruction should not perform 
I/O transfers. (This is not necessary when NEUT- is 
asserted during an interrupt response cycle because 
the cancelled instruction is resubmitted for execu¬ 
tion.) 

In the examples, the notation .f? is used to indicate 
any register except .fx. 
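
Rules 4 and 5 can be checked in the same spirit. The sketch below flags operand reads that fall inside the window in which a pending register write makes the code uninterruptable. It assumes one instruction is issued per cycle; the function name and the (destination, sources) tuple layout are hypothetical.

```python
def uninterruptable_reads(program, internal_bypass=True):
    """Return indices of ops that read a register too soon after the op
    that writes it, according to rules 4 and 5.

    program is a list of (dest, sources) pairs, e.g. (".f2", (".f0", ".f1")).
    """
    bad = set()
    for i, (dest, _) in enumerate(program):
        if not dest:
            continue
        # Readers starting on cycles 2 and 3 of the writer are excluded.
        # For general-purpose registers with the Internal Bypass Mode
        # disabled, cycle 4 is excluded as well (rule 4).
        if dest.startswith(".t") or internal_bypass:
            window = 2
        else:
            window = 3
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if dest in program[j][1]:
                bad.add(j)
    return sorted(bad)


if __name__ == "__main__":
    fig50 = [(".tx", ()), (".f1", (".tx",)), (".f2", (".tx",)), (".f3", (".tx",))]
    print(uninterruptable_reads(fig50))
```

Applied to the sequence of figure 50, the sketch flags the second and third operations (zero-based indices 1 and 2), in line with the "Illegal" annotations.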
Initialization

MODE REGISTER 

The Mode Register controls which of the special modes
are enabled. Normally, it is initialized to the desired
state and is not subsequently altered. Some mode bits
are provided to maintain backward compatibility with
previous versions of the WTL 3132/3332. Other bits
are reserved and should be set to the value specified in
figure 52.


MODE BIT   LOGIC VALUE   DESCRIPTION

M0         0             Internal Bypass Mode (Aadd = Cadd) disabled
           1             Internal Bypass Mode (Aadd = Cadd) enabled
M1         0             fix rounds to negative infinity
           1             fix rounds to nearest (enable range test)
M2         -             Reserved: should be cleared to 0
M3         0             Input Bypass Mode disabled
           1             Input Bypass Mode enabled
M4         0             Output Bypass Mode disabled
           1             Output Bypass Mode enabled
M5         0             FPEX Output disabled
           1             FPEX Output enabled
M6         0             Co-processor Load Mode disabled
           1             Co-processor Load Mode enabled
M7         -             Reserved: should be set to 1
M8         0             FPEX active low and “sticky”
           1             FPEX active high
M9         0             Double-pump Mode disabled
           1             Double-pump Mode enabled
M10        -             Reserved: should be set to 1
M11        0             Internal Bypass Mode (Badd = Cadd) disabled
           1             Internal Bypass Mode (Badd = Cadd) enabled
M12        0             Y Late Input Mode disabled
           1             Y Late Input Mode enabled


Figure 52. Mode selection table 
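
As an illustration of figure 52, the sketch below builds the 13-bit mode value M12..M0 from named options, with the reserved bits forced to their specified values. The option names are hypothetical. How these bits are distributed across the Aadd, Cadd and ABin fields of the fmode instruction is defined by figure 53 and is not modelled here.

```python
# Reserved bits and the values figure 52 specifies for them.
RESERVED = {2: 0, 7: 1, 10: 1}

# Hypothetical option names mapped to mode bit numbers.
MODE_BITS = {
    "internal_bypass_a": 0,    # Aadd = Cadd bypass
    "fix_round_nearest": 1,
    "input_bypass": 3,
    "output_bypass": 4,
    "fpex_enable": 5,
    "coprocessor_load": 6,
    "fpex_active_high": 8,
    "double_pump": 9,
    "internal_bypass_b": 11,   # Badd = Cadd bypass
    "y_late_input": 12,
}


def mode_value(**enabled):
    """Return the integer whose bit n is mode bit Mn."""
    value = sum(bit_value << bit for bit, bit_value in RESERVED.items())
    for name, on in enabled.items():
        if on:
            value |= 1 << MODE_BITS[name]
    return value


if __name__ == "__main__":
    value = mode_value(internal_bypass_a=True, internal_bypass_b=True,
                       output_bypass=True, coprocessor_load=True)
    print(f"{value:013b}")
```

The example enables the Internal Bypass, Output Bypass and Coprocessor Load Modes, which is the combination the recommended mnemonics assume.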




The Mode Register is loaded by the fmode operation.
This causes the Aadd, Cadd and ABin2..0 fields in the
instruction word to be loaded into the Mode Register
as shown by figure 53. fmode completes by the end of
its first cycle.



Figure 53. Load Mode Register instruction format 

Changing the contents of any bits in the Mode Register
will have undefined effects on any currently executing
instructions including I/O operations. The state of all
registers should be initialized after execution of the
fmode instruction.

Some combinations of modes are not allowed.
These are detailed in figure 54.


MODE                                         #     (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)

Internal Bypass Mode (M0 = 1 and M11 = 1)    (1)    -    ✓    ✓    ✓    X    ✓    X    ✓
Input Bypass Mode (M3 = 1)                   (2)    ✓    -    ✓    ✓    ✓    X    ✓    ✓
Output Bypass Mode (M4 = 1)                  (3)    ✓    ✓    -    ✓    X    ✓    ✓    ✓
Y Late Input Mode (M12 = 1)                  (4)    ✓    ✓    ✓    -    ✓    ✓    ✓    ✓
Double-Pump Mode (M9 = 1)                    (5)    X    ✓    X    ✓    -    X    ✓    X
Coprocessor Load Mode (M6 = 1)               (6)    ✓    X    ✓    ✓    X    -    X    ✓
Use of floadrc Operation                     (7)    X    ✓    ✓    ✓    ✓    X    -    X
Use of NEUT-, STALL-, or ABORT- Inputs       (8)    ✓    ✓    ✓    ✓    X    ✓    X    -

X = Should not be enabled together, ✓ = Can be enabled together

Note: Although (7) and (8) are not selected by mode bits, the user is able to avoid the use of such 
functions or control lines. 

Double-Pump Mode (5) requires floadrc (7) to be used every cycle. 

Figure 54. Mode exclusion table 
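
A configuration can be checked against the exclusion table before the Mode Register is loaded. The sketch below lists the pairs marked X in figure 54; because parts of the table are hard to read in the source, treat the pair list as an assumption to be verified against the original data sheet. All names are hypothetical.

```python
from itertools import combinations

# Pairs marked "X" in figure 54 (as far as the source is legible).
EXCLUDED = {frozenset(pair) for pair in [
    ("internal_bypass", "double_pump"),
    ("internal_bypass", "floadrc"),
    ("input_bypass", "coprocessor_load"),
    ("output_bypass", "double_pump"),
    ("double_pump", "coprocessor_load"),
    ("double_pump", "neut_stall_abort"),
    ("coprocessor_load", "floadrc"),
    ("floadrc", "neut_stall_abort"),
]}


def conflicts(active):
    """Return the excluded pairs present in a set of active features."""
    return sorted(tuple(sorted(pair))
                  for pair in combinations(sorted(active), 2)
                  if frozenset(pair) in EXCLUDED)


if __name__ == "__main__":
    print(conflicts({"input_bypass", "coprocessor_load", "output_bypass"}))
```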


RESET SEQUENCE 

Before initializing the contents of the Mode Register,
the WTL 3132/3332 must be set to a stable state after
power up.

Repeating nop and I/O nop instructions for at least four
cycles will flush the multiplier/accumulator pipeline
and allow the internal states to become well-defined.
Until this sequence terminates, the rest of the system
should ignore the contents of the data buses and the
state of the ZERO, FPCN and FPEX pins.

The registers should then all be initialized to known 
values (including the Mode, Condition, Status and 
Tregs) while nops continue to be input. The WTL 
3132/3332 is then able to begin normal operation. 
Division


DIVIDE LOGIC UNIT 

The WTL 3132/3332 have a divide logic unit. This unit 
consists of a look-up table ROM and three delay 
stages. The first cycle of the flut operation transfers the 
Aadd operand (a) to the divide logic unit. During the 
next two cycles this operand selects a seed value for the 
reciprocal of the operand (1/a) from the look-up table. 
This result is written to the Cadd register on the fourth 
cycle. 

If the Internal Bypass Mode is enabled, the result can 
be copied to a multiplier/accumulator input port at the 
same time that the Cadd register is being written. 

The look-up result is an IEEE single-precision number
whose fraction is accurate to seven bits of precision. If
the input is positive or negative infinity (greater than
#7F800000 or less than #FF800000), the result is
zero. If the input is zero, the result is #7FFFFFFF
(which gets clamped to #7F800000 during refinement).
flut does not update the Zero, Condition, or
Status Registers.


Cycle #   Opcode ; I/O                    Comment

 1        flut .f1, .f31                  R0 (~1/a)
 2        fnop
 3        fnop
 4        fmna .f1, .f31, 2, .f30         2 - a × R0
 5        fnop
 6        fnop
 7        fmac .f31, .f30, 0, .f31        R1 = R0 × (2 - a × R0)
 8        fnop
 9        fnop
10        fmna .f1, .f31, 2, .f30         2 - a × R1
11        fnop
12        fnop
13        fmac .f31, .f30, 0, .f30        R2 = R1 × (2 - a × R1)
14        fnop
15        fnop
16        fmac .f0, .f30, 0, .f0          b × 1/a
17        fnop
18        fnop
19        fnop ; fstore .f0               store b/a
20        fnop                            b/a on X port


Notes: Internal bypass enabled (M0 = 1 and M11 = 1)
       X port output bypass enabled (M4 = 1)


Figure 55. Recommended division sequence 


NOTATION:
a = divisor (.f1)
R0 = seed for 1/a (.f31)
R1 = first approximation (.f31)
R2 = second approximation (.f30)
b = dividend (.f0)
b/a = result (.f0)

ALGORITHM:

R1 = R0 × (2 - a × R0)

R2 = R1 × (2 - a × R1)
Division, continued

DIVIDE CODE SUPPORT

The initial approximation to 1/a has to be refined by
successive approximation. This accurate value of 1/a
must then be multiplied by b in order to complete the
divide operation (b/a).

The programmer or compiler has to supply code to
support the following sequence of operations:

1. Execute flut to obtain seed value.

2. Iterate from this value to obtain an accurate divisor
inverse using the Newton-Raphson algorithm.

3. Multiply the dividend by the inverse of the divisor
just generated.

For division, the Newton-Raphson algorithm converges
quadratically. Theoretically, the number of bits of precision
doubles with each iteration. Thus, two iterations
should provide the full 23 bits of precision representable
by the IEEE 32-bit format. Quantization errors
introduced by rounding, however, can prevent the lsb
from being accurate. The code example provides b/a
to 22 bits of precision.
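
The three steps above can be modeled on a host machine to see the quadratic convergence. The following Python sketch is illustrative only: `seed_reciprocal` is a hypothetical stand-in for flut (it simply truncates 1/a to seven fraction bits, which is an assumption about the table contents), and `f32` emulates single-precision rounding. Each 2 - a × R step is rounded once, on the assumption that it behaves like fmna with no rounding between the multiply and the add.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def seed_reciprocal(a: float, bits: int = 7) -> float:
    """Hypothetical stand-in for flut: 1/a with only `bits` fraction bits kept."""
    word = struct.unpack("<I", struct.pack("<f", 1.0 / a))[0]
    word &= ~((1 << (23 - bits)) - 1) & 0xFFFFFFFF  # clear the low fraction bits
    return struct.unpack("<f", struct.pack("<I", word))[0]


def divide(b: float, a: float, iterations: int = 2) -> float:
    """Compute b/a by Newton-Raphson refinement of a seed for 1/a."""
    a, b = f32(a), f32(b)
    r = seed_reciprocal(a)            # step 1: seed value
    for _ in range(iterations):       # step 2: R(n+1) = R(n) x (2 - a x R(n))
        t = f32(2.0 - a * r)          # modeled as one rounding (assumption)
        r = f32(r * t)
    return f32(b * r)                 # step 3: multiply dividend by 1/a


if __name__ == "__main__":
    print(divide(355.0, 113.0), 355.0 / 113.0)
```

Calling `divide` with `iterations=1` instead of 2 shows the intermediate accuracy of roughly 14 bits, in line with the doubling described above. The model handles normalized, non-zero operands only.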


Data Format 


32-BIT FLOATING POINT (IEEE STANDARD) 

The IEEE standard 32-bit floating point word divides 
into three fields: a sign bit, an 8-bit exponent and a 
23-bit fraction field (shown in figure 56). 

The value contained in the 8-bit exponent field ranges 
from -127 to 128 (#00 to #FF) (shown in figure 57). 
The fraction is multiplied by two raised to this power to 
produce a floating point value. 

The significand contains the 23-bit fraction and
the hidden bit, which is inserted during arithmetic processing.


 SIGN   EXPONENT FIELD (E)           FRACTION (F)
 BIT    (8 bits)                   ^ (23 bits)
                                   IMPLICIT BINARY POINT

 | S | E7 E6 E5 E4 E3 E2 E1 E0 | F22 F21 F20 ... F2 F1 F0 |
  31   30                   23   22                      0

Figure 56. 32-bit floating point (IEEE standard) 


The value of an IEEE floating point number is 
determined by the following: 


E        F      VALUE                        DESCRIPTION

1-254    Any    (-1)^S × (1.F) × 2^(E-127)   Normalized number (NRN)
0        Any    (-1)^S × 0.0                 Zero


Figure 57. 32-bit floating point value 


The hidden bit has a value of one for all normalized
numbers and zero for zero. The fraction is the 23 bits
to the right of the hidden bit. Bit F22 has a value of
2^-1; bit F0 has a value of 2^-23; the hidden bit has a
value of 2^0.

All constants are in this IEEE format. 
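
As an illustration of figure 57, the sketch below decodes a 32-bit word the way the table describes. The function name `decode_single` is hypothetical. Treating every word with E = 0 as zero follows the table; the handling of E = 255 follows the infinity and NaN encodings described under IEEE Considerations.

```python
def decode_single(word: int) -> float:
    """Decode a 32-bit word per figure 57 (E = 0 is always treated as zero)."""
    sign = -1.0 if (word >> 31) & 1 else 1.0
    exponent = (word >> 23) & 0xFF      # 8-bit exponent field E
    fraction = word & 0x7FFFFF          # 23-bit fraction field F
    if exponent == 0:
        return sign * 0.0               # zero, whatever the fraction holds
    if exponent == 0xFF:
        # Maximum exponent: infinity if F = 0, otherwise NaN.
        return sign * float("inf") if fraction == 0 else float("nan")
    significand = 1.0 + fraction / 2 ** 23   # hidden bit plus fraction
    return sign * significand * 2.0 ** (exponent - 127)


if __name__ == "__main__":
    for w in (0x3F800000, 0xC0400000, 0x00000001):
        print(f"#{w:08X} ->", decode_single(w))
```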
Data Format, continued


 SIGN EXTENSION                      INTEGER FIELD
 (8 bits)                            (24 bits)             ^ IMPLICIT BINARY POINT

 | sign extension (repeat of bit 23) | 24-bit two's complement integer |
   31                             24   23                            0

Figure 58. 24-bit fixed point (two’s complement) 


24-BIT TWO’S COMPLEMENT INTEGERS 

The value of the 24-bit integer field shown in figure 58
can range from -2^23 to 2^23 - 1 and must be sign-extended
to 32 bits to be compatible with the WTL
3132/3332 format. The eight-bit sign extension field is
a repeat of bit 23, the sign bit of the two’s complement
number.

The user must ensure that integer operands conform to
this format; integer results are automatically sign-extended
to match.

FIX 

The fix function converts a number from floating point
format to sign-extended 24-bit two’s complement integer
format.

If the magnitude of its operand is greater than 2^23
while Encn1 is set to 1 and M1 = 1, fix sets the Condition
Register to 1. This limit was chosen to allow software to
test for the case n = 2^23, which cannot be represented
by a 24-bit two’s complement number. If the operand
is too large to be represented in the integer format, the
result is clamped to either #007FFFFF or #FF800000,
according to its sign.

fix does not attempt to set the Zero or Status Registers.
It executes in the same number of cycles as every other
multiplier/accumulator instruction.
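
The clamping behavior and the sign-extended integer format can be sketched as follows. This is a host-side model under stated assumptions, not a description of the hardware: the names `fix` and `has_consistent_sign_extension` are hypothetical, rounding is assumed to be round-to-nearest-even (Python's `round`), and only finite operands are handled.

```python
INT24_MAX = (1 << 23) - 1      # #007FFFFF
INT24_MIN = -(1 << 23)         # #FF800000 once sign-extended


def fix(value: float) -> int:
    """Convert a finite float to a sign-extended 24-bit integer, as a 32-bit word."""
    n = round(value)                         # assumed round-to-nearest-even
    n = max(INT24_MIN, min(INT24_MAX, n))    # clamp according to sign
    return n & 0xFFFFFFFF                    # two's complement, sign-extended


def has_consistent_sign_extension(word: int) -> bool:
    """True if bits 24-31 all repeat bit 23, as the integer format requires."""
    top = (word >> 23) & 0x1FF               # bits 23..31
    return top in (0, 0x1FF)


if __name__ == "__main__":
    for v in (2.5, -1.0, 1e10, -1e10):
        print(v, f"#{fix(v):08X}")
```

The second helper mirrors the operand check that the float function applies before converting an integer back to floating point.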

FLOAT 

The float function converts a number from sign-extended
two’s complement integer format to floating
point format.

If its operand does not have consistent sign extension
(bits 24-31 all equal) while Encn1 is set to 1 and M1 = 1,
float sets the Condition Register to 1. The result of a float
operation on such an operand is not defined.

float does not attempt to set the Zero or Status Registers.
It executes in the same number of cycles as any
other multiplier/accumulator instruction.




IEEE Considerations 

The WTL 3132 and WTL 3332 comply with the IEEE 
Standard for Binary Floating Point Arithmetic (P754) 
in most respects. The differences described below ap¬ 
ply to all of the arithmetic functions (fsubr, fsub, fadd, 
fmna, fmns, fmac, fabs). 

DENORMALIZED NUMBERS 

Denormalized numbers have a magnitude less than
2^-126 but greater than zero. The IEEE standard includes
denormalized numbers to allow gradual underflow
for operations that produce results that are too
small to be expressed as normalized numbers. The
WTL 3132/3332 do not support denormalized numbers.
If the result of an operation is smaller than 2^-126,
it is replaced by zero and the Zero Register is set to 1.
Denormalized operands are detected and flushed to
zero (with the same sign) before the operation is performed;
no indication of this is provided.
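
The operand flush described above amounts to a simple test on the bit fields. The sketch below is a model only; `flush_denormal` is a hypothetical helper name, and it operates on the raw 32-bit word.

```python
SIGN_MASK = 0x80000000


def flush_denormal(word: int) -> int:
    """Replace a denormalized operand by a zero of the same sign."""
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if exponent == 0 and fraction != 0:   # denormalized: 0 < |x| < 2^-126
        return word & SIGN_MASK           # keep only the sign bit
    return word                           # normalized numbers and zero pass through


if __name__ == "__main__":
    for w in (0x00000001, 0x807FFFFF, 0x00800000):
        print(f"#{w:08X} -> #{flush_denormal(w):08X}")
```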

NOT A NUMBER (NAN) HANDLING 

The IEEE standard represents NaNs with numbers that 
have the maximum exponent value and a non-zero 
fraction. The WTL 3132/3332 do not detect attempts 
to perform calculations on NaNs. Only the flut opera¬ 
tion may produce a NaN (when given zero as an oper¬ 
and). This is clamped to the appropriate infinity when 
refined by the divide code example given on page 38. 
Other operations may have undefined effects when 
given a NaN as an operand, so their use should, in 
general, be avoided. No arithmetic operation generates 
NaNs; all results with the maximum exponent have 
their fractions held to zero. 

INFINITY AND OVERFLOW 

The IEEE standard represents infinities with numbers
that have the maximum exponent value and zero as the
fraction. The WTL 3132/3332 do not detect attempts
to perform calculations on infinite operands. Some operations
may have undefined effects when given an infinite
operand, so their propagation should, in general,
be avoided. However, if an operation creates a result
that is too large to be represented in the floating point
format, its result is clamped to an infinite value as required
by the specification. The Status Register is set
by the creation of an infinite value during an operation.

UNDERFLOW 

When the result of an operation has a magnitude in the
range 0 < n < 2^-126, the WTL 3132/3332 round it to
zero and set the Zero Register to 1. There is no way to
distinguish underflow from a result that is exactly zero.

ROUNDING 

The WTL 3132/3332 support only the round-to-nearest
mode: the infinitely precise result of an operation is
rounded to the closest representation that fits in the
destination format. If the result is exactly halfway between
two representations, it is rounded to the nearest
even fraction.

The IEEE standard requires rounding to occur after 
each arithmetic operation. The WTL 3132/3332 do not 
round between the multiply and add components of 
the fmac, fmns and fmna functions. The error in the 
result is always less than two least-significant bits. 

If the ABin port of the ALU is set to the constant 0.0,
then the fmac function performs a multiply that conforms
to the IEEE standard. Only the fix operation can
be set to round toward negative infinity, by clearing M1 to
zero.
DC Specifications

ABSOLUTE MAXIMUM RATINGS 


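Supply voltage ............................ -0.5 to 7.0 V
Input voltage ............................. -0.5 to 5.5 V
Output voltage ............................ -0.5 to 5.5 V
Operating temperature range (Tcase) ....... -55 °C to 125 °C
Storage temperature range ................. -65 °C to 150 °C
Lead temperature (10 seconds) ............. 300 °C
Junction temperature ...................... 175 °C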


RECOMMENDED OPERATING CONDITIONS 


PARAMETER                         COMMERCIAL        MILITARY        UNIT
                                  MIN     MAX       MIN     MAX

VDD    Supply voltage             4.75    5.25      4.5     5.5     V
Tcase  Operating temperature      0       85        -55     125     °C



DC ELECTRICAL CHARACTERISTICS 


PARAMETER 

TEST CONDITIONS 

commercial’ 

MILITARY ' 

UNIT 

MIN 

TYP 

MAX 

MIN 

TYP 

MAX 

ViHC 

High level clock Input voltage 

V OD = max 

Era 

■■■ 





V 

V,LC 

Low level clock Input voltage 

Vdd = min 



0.8 




V 

V,H 

High level input voltage 

V □□ = MAX 

2.0 






V 

V,L 

Low level input voltage 

Vdd = min 



0.8 




V 

VoH 

High level output voltage 

V dd “ MIN, loH “ ~1 -0 mA 







V 

VoL 

Low level output voltage 

V DO = MIN, loL = 4.0 mA 

m 

■ 

0.4 

■ 

■ 

■ 

V 

l,H 

High level Input current 

V OD = MAX, ViN = VpD 

■ 

■ 

n 

■ 

■ 


p, A 

l|L 

Low level Input current 

V QD = MAX, ViN = OM 



IW 




u A 

' OZL 

Tri-state leakage current low 

V DD = MAX, ViN = OM 




H 



H A 

1 OZH 

Tri-state leakage current high 

V dd = MAX, ViN = Vdd 

■ 

■ 

H 

■ 

■ 

■ 

p A 

' DD 

Supply current 

Vdd = max,Toy = MIN 

■ 

■ 


■ 

■ 

■ 




TTL inputs ^ 

■ 

■ 

200 

■ 

■ 

■ 

mA 

C|N 

Input capacitance 

Vdd = 5.0V 

■ 

10 

10 

■ 

■ 

!■ 

PF 

CcLK 

Clock capacitance 

T ambient = 25°C 


25 

30 




PF 

^OUT 

I/O, Output capacitance 

f = 1 MHz 


15 

20 




pF 

C OE 

OEX-, OEZ- capacitance 


■ 

20 

25 

■ 

■ 

■ 

pF 


NOTES: 1 Worst case over power and temperature range. 
2 Input levels are 0.4V and 3.4V 



Timing Diagrams 

Test Switching Circuit

[Figure: test switching circuit. Recoverable labels: 2.4 V, 400 Ω, output pin, 40 pF.]



Figure 61. Tri-state timing 
Timing Diagrams, continued



Figure 62. Signal timing diagram 
AC Specifications


AC SWITCHING CHARACTERISTICS _ 

I ~ AC TEST CONDITIONS ’■ ^ 


ViH =3,4V Voh=2.8V, Ioh=- 1.0 mA ^ g 

Vcc = MIN Yu = 0.4V VoL = 0.4V, loL = 4.0 mA 

5 ^ C C LOAD = 

40 pF 

DESCRIPTION 

WTL3132-120 

WTL 3332-120 

XL-3132-120 

COMMERCIAL 

WTL3132-100 

WTL 3332-100 

XL-3132-100 

COMMERCIAL 

WTL3132-080 

WTL 3332-080 
XL-3132-080 
COMMERCIAL 

WTL313; 
WTL 333 
XL-3132 
MILITAF 

>-120 

2-120 

-120 

1Y 

UNIT 


MIN 

MAX 

MIN 

MAX 

MIN 

MAX 

MIN 

MAX 

Tcy Clock cycle time 

Tch Clock high time 

Tcl Clock low time 

Tp, Clock rise time 

Tp Clock fall time 

1 

5 

5 

m 

5 

5 

80 


120 


ns 

ns 

ns 

ns 

ns 

Bus Inputs (C, X, Y): 

Tg Input setup time 

Tp, Input hold time 

20 

2 


15 

2 






ns 

ns 

Tsa Input setup time 

Tha Input hold time 
(X bus, coprocessor load 
mode) 

20 

2 


15 

2 






t?s 

ns 

Tsb Input setup time 

Thb Input hold time 
(X bus, double-pump or 

Y bus, late Input mode) 

20 

2 


15 

2 






ns 

ns 

Bus outputs (X, Z): 

Tq Output delay time 

Ty Output valid time 

3 

35 

3 

30 





ns 

ns 

Tena TrI-state enable time 4 
Tdis Tri-state disable time 5 


35 

35 


B 





ns 

ns 

NEUT-, STALL-, ABORT-: 
Tgpj Input setup time 

Thpj Input hold time 

20 

2 


15 

2 






ns 

ns 

FPCN, FPEX, ZERO: 

Tqp Output delay time 

Typ Output valid time 

3 

35 

3 

30 





ns 

ns 

Top Pipelined operation time 
per stage 

Tla Total latency 

reglster-to-reglster 

120 

360 


100 

300 


80 

240 


120 

360 


ns 

ns 

NOTES: 1. Worst case over time and temperature range.

2. Input levels are 0.4V and 3.4V.

3. Timing transitions are measured at 1.5V unless otherwise noted.

4. Device must be powered for at least 20 ms before testing.

5. Tdis is not tested but is guaranteed by design.


Figure 63. AC test conditions 
Pin #1
Identifier 

1 

2 

3 

4 

5 

^ 6 

7 

8 


9 

10 

11 

12 

13 

14 

15 

A 

GND 

NC 

NC 

X18 

NC 

NC 

NC 

X23 

VDD 

NC 

NC 

X27 

NC 

NC 

NC 

B 

NC 

NC 

GND 

X16 

NC 

NC 

X20 

X22 

X24 

X25 

X26 

NC 

X29 

X31 

GND 

C 

NC 

NC 

X15 

GND 

X17 

X19 

X21 

NC 

VDD 

NC 

X28 

NC 

X30 

TIE 

LOW 

F2 

D 

NC 

X12 

X14 











GND 

FO 

Adstl 

E 

X10 

XII 

X13 











F1 

AdstO 

AbinO 

F 

X8 

NC 

NC 











Abln2 

Abln1 

A.BORT- 

G 

NC 

X9 

VOO 











STALL- 

NEUT- 

Cwen- 








W 1 L 0 lOZ 








H 

NC 

X7 

VDO 




TOP VIEW 





Cadd2 

Cadd4 

Cadd3 

J 

X6 

XS 

NC 











Caddi 

Aadd3 

CaddO 

K 

NC 

NC 

NO 











Aaddi 

Aadd2 

Aadd4 

L 

X4 

X2 

NC 











BaddO 

Badd4 

AaddO 

M 

X3 

XI 

GND 











Dadd2 

Baddt 

Badd3 

N 

NC 

NC 

OEX- 

VDD 

FPCN 

GND 

TIE 

LOW 

TIE 

LOW 

GND 

GND 

GND 

VDD 

Daddl 

Dadd4 

Badd2 

P 

xo 

TIE 

LOW 

Encn1 

FPEX 

EncnO 

TIE 

LOW 

CLK 

GND 

GND 

GND 

GND 

GND 

VDD 

DaddO 

Dadd3 

R 

ZERO 

GND 

iocto 

IOCt1 

Mbin- 

TIE 

LOW 

TIE 

LOW 

GND 

GND 

GNb 

GND 

GND 

GND 

GND 

GND 


Notes: Pins marked "Tie Low" must be connected to ground. Pins marked "NC" should be left 
unconnected (floating). 


Figure 64. WTL 3132 and XL-3132 pin configuration 
Pin Configuration, continued


Pin #1 
Identifier 

A 

B 
C 
D 
E 
F 
G 
H 
J 
K 

L 
M 
N 
P 

R 
T 

U 

Notes: Pins marked “Tie Low” must be connected to ground. Pins marked "NC” should be left 
unconnected (floating). 

Figure 65. WTL 3332 pin configuration 


2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 


OEX- 

TIE 

LOW 

ZO 

Z2 

X3 

Z4 

NC 

X5 

Z7 

X9 

Z9 

XII 

212 

XIO 

X15 

Z15 

GND 


ZERO 

xo 

21 

X2 

NC 

NC 

Z5 

VDD 

VDD 

X8 

Z10 

Z11 

Z13 

X14 

GND 

GND 


Enoni 

GND 

X1 

Z3 

X4 

X6 

NC 

Z6 

Z8 

X7 

XIO 

X12 

Z14 

Z16 

X17 

217 


FPEX 

FPCN 






X16 

218 

X18 


Mbin- 

lOCtI 



Z20 

X19 

Z21 


CLK 

TIE 

LOW 


Z19 

X21 

X20 


TIE 

LOW 

Y28 


VDD 

VDD 

VDD 


TIE 

LOW 

Y30 

WTL 3332 

TOP VIEW 

X23 

X22 

222 

Z23 

X24 

224 


Y29 

Y26 

Y22 

Y24 

Y25 

Z25 

X25 

226 

Y23 

Y21 

Y27 

VDD 

VDD 

VDD 

Y18 

Y19 

Y20 

X27 

X28 

227 

Y16 

Y15 

Y17 



X29 

Z28 

X26 

Y14 

Y13 

Y12 


■ 



ZOI 

ZOO 

Z29 

Y10 

Y11 

Y9 

Dadd3 

Y6 

Badd2 

Aadd4 

Aaddi 

CaddO 

Caddi 

Y2 

YO 

B 

Adsti 

FI 

X31 

XOO 

VDD 

VDD 

Dadd2 

Y7 

Badd1 

BaddO 

AaddO 

Aadd2 

AaddO 

Cadd2 

Cadd4 

Cwen- 

AblnO 

Abin2 

AdstC 

FO 

TIE 

LOW 

Y8 

DaddO 

Dadd 

Dadd4 

BaddO 

Y5 

Badd4 


YO 

Y1 

Cadd3 

kBORT 



IB 

GND 

■ 
WTL 3132 144-PIN PIN GRID ARRAY




• 

Symbol    DIMENSIONS
          INCHES                 MM

A1        0.080 ± 0.008          2.03 ± 0.20
A2        0.180 typ.             4.57 typ.
A3        0.050                  1.27
D         1.575 sq. ± 0.016      40.0 sq. ± 0.41
E1        1.400 sq. ± 0.012      35.56 sq. ± 0.30
E2        0.050 dia. typ.        1.27 dia. typ.
E3        0.018 ± 0.002          0.46 ± 0.05
d         0.070 dia. typ.        1.78 dia. typ.
e         0.100 typ.             2.54 typ.


WTL 3332 168-PIN PIN GRID ARRAY 




PIN I 

KOVAR [ 



BOTTOM VIEW SIDE VIEW 


TOP VIEW 


Symbol    DIMENSIONS
          INCHES                 MM

A1        0.095 ± 0.009          2.41 ± 0.23
A2        0.180 typ.             4.57 typ.
A3        0.050                  1.27
D         1.750 sq. ± 0.018      44.5 sq. ± 0.46
E1        1.600 sq.              40.6 sq.
E2        0.050 dia. typ.        1.27 dia. typ.
E3        0.018 ± 0.002          0.46 ± 0.05
d         0.070 dia. typ.        1.78 dia. typ.
e         0.100 typ.             2.54 typ.
Appendix A: The XL-3132 in the XL Environment


WEITEK XL-SERIES 

The WEITEK XL-Series is a family of three VLSI 
processors: the XL-8000, a high-speed 32-bit integer 
processor; the XL-8032, a single-precision floating 
point processor; and the XL-8064, a double-precision 
floating point processor. 

These processors give the performance of bit-slice 
components, and are supported by a full complement 
of development tools. These include C and FOR¬ 
TRAN 77 compilers, and an assembler which produces 
code that may be optimized by an instruction paralle- 
lizer. The development system offers both hardware 
and software simulators with debugging facilities. The 
programmer remains free to create custom microcode 
routines for peak performance. 

This appendix is dedicated to the XL-8032 single-pre¬ 
cision floating point processor. Further information 
may be found in the XL-Series Overview, the XL-Series 
Hardware Designer's Guide, the XL-Series Program¬ 
mer's Reference Manual, the XL-8136 Data Sheet and 
the XL-8137 Data Sheet. 

The XL-8032 processor consists of three interconnected
VLSI components:

• XL-8136 Program Sequencing Unit (PSU)

• XL-8137 Integer Processing Unit (IPU)

• XL-3132 Floating Point Unit (FPU)

Each of these components is manufactured in high- 
density, low-power CMOS and delivered in a 144-pin 
grid array package. Unlike a traditional microproces¬ 
sor, the XL-8032 is not constrained by the limits of 
circuit density or bus bandwidth imposed by a single 
chip in a small package. Consequently, bit-slice per¬ 
formance levels can be obtained both for integer and 
floating point operations. 

The XL-Series simplifies system design. Zero glue in¬ 
terfacing is provided by a small number of dedicated 
signals that communicate state information between the 
components. These signals and the system buses need 
only be connected as shown in figure 69 in order to 
create the XL-8032. The purpose of each interconnec¬ 
tion is described in more detail below. 


BUSES 

Four high-bandwidth system buses are provided by the 
XL-8032: 

1. Code bus. 

The 64-bit code bus feeds the code input ports of 
the PSU, IPU and FPU. The PSU and IPU share 32 
of the 64-bits; this half of the code word directs pro¬ 
gram control operations, address generation, loads 
and stores, and integer arithmetic. The remainder of 
the code word directs floating point operation. The 
designer may choose to lengthen the code word to 
add custom extensions to the processor architecture. 

2. Data bus.

The 32-bit data bus is shared by the IPU and FPU.
It allows bytes, 32-bit integers and 32-bit floating
point numbers to be transferred between the processing
units and data memory.

3. Code Address bus. 

The 32-bit code address bus carries the address of 
the next instruction from the PSU to the code mem¬ 
ory. A word address allows up to 32 Gbytes of 64-bit 
wide code memory. 

4. Data Address bus. 

A 32-bit data address bus carries the address of the 
next data read or write. The address is generated by 
the IPU and the data may be transferred to or from 
the IPU or FPU as required. A byte address allows 
up to 4 Gbytes of 32-bit wide data memory. Support 
for accessing bytes, half-words, and words is pro¬ 
vided by the IPU. 
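
The memory sizes quoted for the two address buses follow directly from the address width and the addressing granularity. The short calculation below is only a worked check of that arithmetic; the constant names are illustrative.

```python
ADDRESS_BITS = 32
GBYTE = 1 << 30

# Code address bus: a word address, each word being 64 bits (8 bytes) wide.
CODE_WORD_BYTES = 64 // 8
code_memory_bytes = (1 << ADDRESS_BITS) * CODE_WORD_BYTES

# Data address bus: a byte address.
data_memory_bytes = 1 << ADDRESS_BITS

if __name__ == "__main__":
    print("code memory:", code_memory_bytes // GBYTE, "Gbytes")
    print("data memory:", data_memory_bytes // GBYTE, "Gbytes")
```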

The XL-3132 is designed to hook directly to the code
and data buses alongside the other components of the
XL-8032. When driven by the same system clock, the
code word is sampled by all three components simultaneously,
and the data bus is driven or sampled at the
same time in the cycle no matter which component is
transferring information.

The code and data memory systems may be imple¬ 
mented with SRAM, static column DRAM or inter¬ 
leaved DRAM. Both code and data caches are sup¬ 
ported by the XL-8032 chip set. More details are pro¬ 
vided in the XL-Series Hardware Designer’s Guide. 
Appendix A: The XL-3132 in the XL Environment, continued

[Figure 66 diagram: layout of the XL-3132 bit fields in the code word. Recoverable field names: F, Aadd, Badd, Cwen, Cadd, Encn, Ioct, Abin, Adst, and Dadd (in the sequencer field); reserved bits are marked X. Bit positions 31, 28 and 24 are labeled; the field-width row was not recoverable.]


1. Dashed lines indicate bits in the sequencer field. 

2. Bits marked with an "X” are reserved bits. 

3. The meaning of each bit field is described in the body of this data sheet. 


Figure 66. XL-3132 signal assignments in the XL-8032 code word 


INSTRUCTION FORMAT 

The XL-8032 has a 64-bit instruction word. The bits
that are directed to the XL-3132 are shown in figure
66.

The lower 32 bits of the instruction word are shared by
the IPU and the PSU. Bits 0-23 normally define the
IPU operation. Bits 24-31 define the instruction flow
control performed by the PSU. Five of these control
bits, 24-28, are also used as the floating point register
address (Dadd) when floating point load and store operations
are performed. This saves on code bits and
ensures that the FPU and IPU never compete for the
data bus.

The XL-3132 has 34 C port inputs. When used in the 
XL-8032 configuration, the Dadd field is tied to the 
appropriate bits of the PSU code input. The Mbin- bit 
is tied to GND, reducing the number of bits used in the 
XL-8032 instruction word to 60; the remaining four 
are marked with an X in figure 66 and should be set to 
0 in any code to maintain compatibility with future ver¬ 
sions of the processor. 
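
The field positions stated above (bits 0-23 for the IPU, bits 24-31 for the PSU, with bits 24-28 doubling as Dadd) can be illustrated with a few shifts and masks. The function names below are hypothetical and the sketch treats the instruction as a plain 64-bit integer; it says nothing about the positions of the other XL-3132 fields.

```python
def ipu_field(code_word: int) -> int:
    """Bits 0-23: normally define the IPU operation."""
    return code_word & 0xFFFFFF


def psu_field(code_word: int) -> int:
    """Bits 24-31: instruction flow control performed by the PSU."""
    return (code_word >> 24) & 0xFF


def dadd_field(code_word: int) -> int:
    """Bits 24-28: floating point register address used by fload/fstore."""
    return (code_word >> 24) & 0x1F


if __name__ == "__main__":
    word = (21 << 24) | 0x123456     # Dadd = register 21, arbitrary IPU bits
    print(dadd_field(word), hex(ipu_field(word)))
```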

LOAD/STORE MODEL 

The XL-Series has a consistent load/store model regardless
of processor configuration. Each processing
unit has its own register file; register moves between the
IPU and FPU must be made through the data memory.
Each of these register files is multi-ported and each
register may be the operand source or result destination
of any instruction implemented by the unit. When
an instruction takes more than one cycle to execute,
the registers that supply its operands and receive its result
cannot be modified until it has been completed.
This allows any instruction to be resubmitted for execution
after an interrupt with its original state.

Transactions between the register files and data mem¬ 
ory are performed with dedicated load/store instruc¬ 
tions. The only restriction on loads and stores is that 
the operands of an operation be loaded before it is 
executed and that it shall have completed before its 
result is stored. This allows the parallelizer consider¬ 
able freedom to optimize register usage and I/O trans¬ 
actions. 

An example of the normal sequence of operation is 
given below. This example leaves several free cycles in 
which other loads, stores, and calculations could be 
performed in parallel. 


addr .ra 
fload .fx 
fabs .fx, .fy
nop 

addr .rb 
fstore .fy 


Figure 67. *rb = |*ra|
Using the Coprocessor Load Mode gives the fload and
fstore operations the same timing as the IPU’s load 
and store operations. For loads, the address is pre¬ 
sented on the AD bus at the beginning of a cycle and 
the data is expected to be available on the D bus by the 
end of that cycle. For stores, the address is presented 
on the AD bus at the beginning of a cycle and the data 
is driven onto the D bus during the next cycle. 

Because the addr and fload instructions can be exe¬ 
cuted in parallel, they may be pipelined to support con¬ 
tiguous load operations (one per cycle). 

If, however, loads and stores are to be interleaved, 
each store must be allowed two cycles: the latter cycle 
is then always available for the I/O nop required when 
following an fstore by an fload. This has minimal im¬ 
pact on overall performance because loads usually out¬ 
number stores; and the parallelizer can organize I/O 
transfers efficiently. If the code constraints covered in 
the body of this data sheet are followed or if WEITEK 
software tools are used, then this load/store model will 
be obeyed. 


MODES 

The XL-3132 Mode Register must be initialized to the 
values given in figure 68 when coupled into the 
XL-8032 processor. The resulting programming model 
is illustrated in figure 70. Each selection is explained 
here: 

1. Internal Bypass Mode is enabled to maximize per¬ 
formance. 

2. The fix and float range test may be enabled or dis¬ 
abled as required. 

3. Input Bypass (and Double-Pump) Modes are disabled,
because they cannot operate when the Coprocessor
Load Mode is enabled. The MBin and
ABin ports on the multiplier/accumulator cannot,
therefore, receive operands from the C bus.

4. Output Bypass Mode is enabled to maximize per¬ 
formance. 

5. Overflows may be enabled or disabled as required. 

6. Coprocessor Load Mode is enabled so that the 
XL-3132 matches the XL-Series load/store model. 

7. The Y port Late Input Mode is disabled; it only op¬ 
erates with the WTL 3332. 


MODE BIT   LOGIC VALUE   DESCRIPTION

M0         1             Internal Bypass Mode (Aadd = Cadd) enabled
M1         1             fix and float range test enabled (may be disabled)
M2         0             Reserved: must be cleared to 0
M3         0             Input Bypass Mode disabled
M4         1             Output Bypass Mode enabled
M5         1             Overflow exception enabled (may be disabled)
M6         1             Coprocessor Load Mode enabled
M7         1             Reserved: must be set to 1
M8         0             FPEX active low and "sticky"*
M9         0             Double-pump Mode disabled
M10        1             Reserved: must be set to 1
M11        1             Internal Bypass Mode (Aadd = Badd) enabled
M12        0             Y Late Input Mode disabled

* These features are not available on all versions of the XL-3132; check the Programmer's
Reference Manual for details.


Figure 68. Mode selection table 
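
For tooling that needs the figure 68 settings as a single value, the mode bits can be packed into an integer. The sketch below assumes, purely for illustration, that mode bit Mn occupies bit n of the packed word; how the value is actually delivered to the Mode Register is not covered here. `coprocessor_load_conflict` is a hypothetical helper that checks the restriction stated in item 3 above.

```python
# Logic values from figure 68, keyed by mode bit number.
XL_MODE_BITS = {
    0: 1, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1,
    7: 1, 8: 0, 9: 0, 10: 1, 11: 1, 12: 0,
}


def mode_word(bits: dict) -> int:
    """Pack mode bits into an integer, assuming Mn maps to bit n."""
    word = 0
    for n, value in bits.items():
        word |= (value & 1) << n
    return word


def coprocessor_load_conflict(bits: dict) -> bool:
    """True if Coprocessor Load (M6) is combined with Input Bypass (M3) or Double-Pump (M9)."""
    return bool(bits[6] and (bits[3] or bits[9]))


if __name__ == "__main__":
    print(f"#{mode_word(XL_MODE_BITS):04X}", coprocessor_load_conflict(XL_MODE_BITS))
```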
Figure 69. XL-8032 schematic


Figure 70. XL-3132 in XL mode


Appendix A: The XL-3132 in the XL Environment, continued


CONDITIONS AND EXCEPTIONS 

The XL-8032 provides several signals which transfer 
state information from the processing units (IPU, FPU) 
to the sequencer (PSU). These are either conditions, 
upon which the PSU may decide to branch; or excep¬ 
tions, which require software intervention to recover 
gracefully. 

The FPCN output on the XL-3132 should be connected
to the FPCN input on the XL-8136. The FPCN signal is
enabled by the Encn1..0 field in the instruction word to
indicate whether the result of an operation is = 0, < 0,
or <= 0. The PSU may then execute a "branch on condition"
instruction to selectively transfer program control
according to the outcome of this comparison.

The FPEX- output on the XL-3132 should be connected
to the EXT4- interrupt input on the XL-8136. If
the overflow enable bit in the mode register is set, then
any arithmetic operation that generates an invalid result
can flag this exception to the PSU. The system
software is expected to react appropriately to this interrupt.

NEUT-, STALL- AND ABORT- 

The XL-8032 components all use the NEUT-, STALL- 
and ABORT- signals. These pins should be connected 
directly between the three chips in the XL-8032 proc¬ 
essor (see figure 69). 

NEUT- cancels the effect of the current instruction.
The signal is generated by the PSU. It is normally used
in the shadow of a delayed branch to prevent the instruction
in the pipeline from having any effect on the
state of the IPU and FPU.

STALL- cancels the effect of the next instruction. It 
should be generated by the code memory subsystem to 
indicate the delay or absence of the correct code word. 
This prevents any invalid operation that may be present 
on the code input at this time from affecting the state 
of the processor. It allows wait states to be inserted in 
code fetches, perhaps to allow for DRAM refresh or a 
code cache miss. 

ABORT- cancels the effect of both the current and the 
next instructions. It should be generated by the data 
memory subsystem to indicate the inability of the sys¬ 
tem to instantly access the required data word. This 
allows the canceled instructions to be repeated when 
the address becomes valid and for this retry to have the 
correct effect. It allows the data memory to be “not 
ready” if, for example, a page fault occurs. 
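
The effect of the three signals on an instruction stream can be summarized with a small model. This is a sketch of the rules as stated above, not of the hardware timing: instructions are numbered by the cycle in which they are presented, each argument lists the cycles in which a signal is asserted, and the function name is hypothetical.

```python
def surviving_instructions(count, neut=(), stall=(), abort=()):
    """Return the instruction indices that still take effect."""
    cancelled = set()
    for cycle in neut:                 # NEUT-: cancels the current instruction
        cancelled.add(cycle)
    for cycle in stall:                # STALL-: cancels the next instruction
        cancelled.add(cycle + 1)
    for cycle in abort:                # ABORT-: cancels current and next
        cancelled.update((cycle, cycle + 1))
    return [i for i in range(count) if i not in cancelled]


if __name__ == "__main__":
    print(surviving_instructions(5, neut=[1]))
    print(surviving_instructions(5, stall=[1]))
    print(surviving_instructions(5, abort=[1]))
```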
Cycle #   Opcode ; I/O                                Comment

 1        fnop ; fload .f6
 2        fadd .f6, 0, .t1 ; fload .f0                move 3.0 to .t1
 3        fmac .f0, .f0, 0, .f6 ; fload .f5           R0 × R0
 4        fmac .f0, .f5, 0, .f3 ; fload .f4           0.5 × R0
 5        fnop
 6        fmna .f4, .f6, .t1, .f7                     3.0 - (a × R0^2)
 7        fnop
 8        fnop
 9        fmac .f3, .f7, 0, .f1                       R1
10        fnop
11        fnop
12        fmac .f1, .f1, 0, .f6                       R1 × R1
13        fmac .f1, .f5, 0, .f3                       0.5 × R1
14        fnop
15        fmna .f4, .f6, .t1, .f7                     3.0 - (a × R1^2)
16        fnop
17        fnop
18        fmac .f3, .f7, 0, .f2                       R2
19        fnop
20        fnop
21        fmac .f2, .f4, 0, .f3                       s = a × R2
22        fnop
23        fnop
24        fnop ; fstore .f3                           store s
25        fnop                                        s on X port

Figure 71. 
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Appendix B: Programming Examples, continued 


The square root code given fetches a seed for the first 
approximation to the result and then proceeds to refine 
its value using the Newton-Raphson algorithm. It differs 
from the example for division given in the body of the 
data sheet in that no on-chip look-up table is provided 
for the seed value. 
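
The refinement performed by the square root code can be modeled on a host as below. The iteration R(n+1) = 0.5 × R(n) × (3.0 - a × R(n)^2) and the final product s = a × R2 are taken from the comments in figure 71; everything else is an assumption made for illustration. In particular `seed_rsqrt` is a hypothetical stand-in for the external table (it truncates 1/sqrt(a) to seven fraction bits), and the 3.0 - a × R^2 step is rounded once, as for fmna.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def seed_rsqrt(a: float, bits: int = 7) -> float:
    """Hypothetical external table: 1/sqrt(a) with `bits` fraction bits kept."""
    word = struct.unpack("<I", struct.pack("<f", 1.0 / math.sqrt(a)))[0]
    word &= ~((1 << (23 - bits)) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", word))[0]


def sqrt_newton(a: float, iterations: int = 2) -> float:
    """Square root of a positive, normalized a via reciprocal-root refinement."""
    a = f32(a)
    r = seed_rsqrt(a)
    for _ in range(iterations):
        r_squared = f32(r * r)              # R x R
        half_r = f32(0.5 * r)               # 0.5 x R
        t = f32(3.0 - a * r_squared)        # 3.0 - (a x R^2), one rounding
        r = f32(half_r * t)                 # next approximation
    return f32(a * r)                       # s = a x R2


if __name__ == "__main__":
    print(sqrt_newton(2.0), math.sqrt(2.0))
```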

The external table assumed here has the same accuracy 
as the internal divide look-up table (all of the 
exponent and the seven most-significant bits of the 
fraction). The following formulae may be used to 
calculate the entries of this table. 

The table entry exponent (G) and fraction (H) are given 
in terms of the operand exponent (E) and fraction (F). 
Care should be taken to ensure that zeroes, negative 
numbers, infinities and NaNs are handled correctly. 

G. [mzLj] 

ifE(S) = ;,■ f ^ .236 

'■ V257 + F 

2 

if E(8) = 0; ff=\ . -256 

'■ V512 + 2F 
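
As a rough illustration of how such a table entry might be generated, the Python sketch below derives a reciprocal-square-root seed from the exponent and the seven most-significant fraction bits of a single-precision operand. It does not reproduce the formulae above. It is an assumption-laden sketch that evaluates 1/sqrt at the midpoint of the interval of operands sharing one table address and keeps seven fraction bits of the result. The function name `rsqrt_seed` and the decision to reject zeroes, denormals, negative numbers, infinities and NaNs with an exception are hypothetical choices.

```python
import math
import struct

def rsqrt_seed(a):
    """Seed for 1/sqrt(a) from the exponent E and 7-bit fraction F of a."""
    bits = struct.unpack(">I", struct.pack(">f", a))[0]   # IEEE single image
    sign = bits >> 31
    e = (bits >> 23) & 0xFF          # biased exponent E
    f7 = (bits & 0x7FFFFF) >> 16     # F: seven most-significant fraction bits
    if sign or e == 0 or e == 0xFF:
        raise ValueError("zero, denormal, negative, infinity or NaN operand")
    # Midpoint of all operands that share this (E, F) table address.
    mid = math.ldexp(1.0 + (f7 + 0.5) / 128.0, e - 127)
    m, ex = math.frexp(1.0 / math.sqrt(mid))
    # Keep an implicit leading one plus seven fraction bits.
    return math.ldexp(round(m * 256) / 256, ex)

if __name__ == "__main__":
    for a in (1.0, 2.0, 3.5, 100.0):
        print(a, rsqrt_seed(a), 1.0 / math.sqrt(a))
```

A real table would be indexed directly by the 15 address bits (E and F) and would need defined entries for the special operands that this sketch simply rejects.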


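4X4 MATRIX TRANSFORM 

NOTATION: 

Transform matrix A (a11 to a44) (.f16 to .f31) 
Operand vector X (x1 to x4) (.f0 to .f3) 
Result vector Y (y1 to y4) (.f4 to .f7) 
Partial results p1 to p4 (Tregs) 

ALGORITHM: 

Y = AX 

$$
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
$$
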

[ ] = truncate to lower integer. 


Cycle #   Opcode                         I/O          Comment

   1      fmac  .f0, .f16, 0, .t1        fload .f0    p1 = a11 x x1
   2      fmac  .f0, .f17, 0, .t2                     p2 = a21 x x1
   3      fmac  .f0, .f18, 0, .t3                     p3 = a31 x x1
   4      fmac  .f0, .f19, 0, .t1                     p4 = a41 x x1
   5      fmac  .f1, .f20, .t1, .t2      fload .f1    p1 = a12 x x2 + p1
   6      fmac  .f1, .f21, .t2, .t3                   p2 = a22 x x2 + p2
   7      fmac  .f1, .f22, .t3, .t1                   p3 = a32 x x2 + p3
   8      fmac  .f1, .f23, .t1, .t2                   p4 = a42 x x2 + p4
   9      fmac  .f2, .f24, .t2, .t3      fload .f2    p1 = a13 x x3 + p1
  10      fmac  .f2, .f25, .t3, .t1                   p2 = a23 x x3 + p2
  11      fmac  .f2, .f26, .t1, .t2                   p3 = a33 x x3 + p3
  12      fmac  .f2, .f27, .t2, .t3                   p4 = a43 x x3 + p4
  13      fmac  .f3, .f28, .t3, .f4      fload .f3    y1 = a14 x x4 + p1
  14      fmac  .f3, .f29, .t1, .f5                   y2 = a24 x x4 + p2
  15      fmac  .f3, .f30, .t2, .f6                   y3 = a34 x x4 + p3
  16      fmac  .f3, .f31, .t3, .f7                   y4 = a44 x x4 + p4

This example is not interruptible. 
Results can be read from the Z port of the WTL 3332. 


Figure 72. 
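
The accumulation order used in Figure 72 can be modelled with a short Python sketch. It is only an arithmetic model under stated assumptions: it ignores pipeline timing and the rotation of the temporary registers, and the helper names `transform` and `a_register` are hypothetical. The register mapping assumes the column-by-column layout implied by the listing, where .f16 holds a11, .f17 holds a21 and .f20 holds a12.

```python
def a_register(i, j):
    """Register assumed to hold a(i+1, j+1), with 0-based i (row), j (column)."""
    return ".f%d" % (16 + 4 * j + i)

def transform(a, x):
    """Y = A X for a 4x4 matrix given as a list of rows.

    One operand x[j] is combined with a whole matrix column before the
    next operand is used, as in the four groups of four fmac cycles.
    """
    p = [0.0] * 4                      # partial results p1..p4
    for j in range(4):                 # operand x[j]: cycles 4j+1 .. 4j+4
        for i in range(4):             # one multiply/accumulate per cycle
            p[i] = a[i][j] * x[j] + p[i]
    return p                           # after the fourth group, p holds y1..y4

if __name__ == "__main__":
    a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(transform(a, [1, 0, -1, 2]))
```

Processing one operand against a full column means each x value is needed for four consecutive cycles, which is consistent with the listing loading one operand every fourth cycle.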
RADIX-2 FFT 

In-place algorithm for radix-2 butterflies. 

Two butterflies are evaluated per iteration. The second 
butterfly's operands are marked with #. 


NOTATION: 

Source Operands            Results

Ar, Ar#  (.f4, .f20)       Ar', Ar#'
Ai, Ai#  (.f5, .f21)       Ai', Ai#'
Br, Br#  (.f3, .f19)       Br', Br#'
Bi, Bi#  (.f0, .f16)       Bi', Bi#'
Wr, Wr#  (.f2, .f18)
Wi, Wi#  (.f1, .f17)


ALGORITHM: 

Ar' = Ar + (Br x Wr - Bi x Wi) 
Ai' = Ai + (Br x Wi + Bi x Wr) 
Br' = Ar - (Br x Wr - Bi x Wi) 
Bi' = Ai - (Br x Wi + Bi x Wr) 


Cycle #   Opcode                         I/O           Comment

          fnop                           fload .f0
          fnop                           fload .f16
   1      fmns  .f1, .f0, 0, .t1         fload .f1     -Bi x Wi
   2      fmac  .f2, .f0, 0, .t2         fload .f2     +Bi x Wr
   3      fmns  .f17, .f16, 0, .t3       fload .f17    -Bi# x Wi#
   4      fmac  .f18, .f16, 0, .t1       fload .f18    +Bi# x Wr#
   5      fmac  .f3, .f2, .t1, .f8       fload .f3     Br x Wr - Bi x Wi
   6      fmac  .f3, .f1, .t2, .f9                     Br x Wi + Bi x Wr
   7      fmac  .f19, .f18, .t3, .f24    fload .f19    Br# x Wr# - Bi# x Wi#
   8      fmac  .f19, .f17, .t1, .f25                  Br# x Wi# + Bi# x Wr#
   9      fadd  .f4, .f8, .f10           fload .f4     Ar'
  10      fsub  .f4, .f8, .f11                         Br'
  11      fadd  .f5, .f9, .f12           fload .f5     Ai'
  12      fsub  .f5, .f9, .f13                         Bi'
  13      fadd  .f20, .f24, .f26         fload .f20    Ar#'
  14      fsub  .f20, .f24, .f27         fload .f21    Br#'
  15      fadd  .f21, .f25, .f28         fload .f0     Ai#'
  16      fsub  .f21, .f25, .f29         fload .f16    Bi#'

This example is not interruptible. 
Results can be read from the Z port of the WTL 3332. 


Figure 73. 
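
For reference, one radix-2 butterfly as defined by the algorithm equations can be written as a small Python function. This is a sketch of the arithmetic only, with no attempt to model the pipelined schedule of Figure 73. The function name `butterfly` and the use of (real, imaginary) tuples are assumptions made for illustration.

```python
def butterfly(a, b, w):
    """Radix-2 butterfly on (real, imag) pairs: A' = A + B*W, B' = A - B*W."""
    ar, ai = a
    br, bi = b
    wr, wi = w
    tr = br * wr - bi * wi      # real part of B x W
    ti = br * wi + bi * wr      # imaginary part of B x W
    return (ar + tr, ai + ti), (ar - tr, ai - ti)

if __name__ == "__main__":
    a_new, b_new = butterfly((1.0, 2.0), (3.0, -1.0), (0.6, 0.8))
    print(a_new, b_new)
```

The listing evaluates two such butterflies per iteration, interleaving their multiplies and adds so that the multiplier/accumulator issues an operation in every cycle.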
Ordering Information 


PACKAGE TYPE    TEMPERATURE RANGE       ORDER NUMBER

144-Pin PGA     Tc = 0 to +85 °C        WTL 3132-GCD-080, -100, -120
168-Pin PGA     Tc = 0 to +85 °C        WTL 3332-GCD-080, -100, -120
144-Pin PGA     Tc = -55 to +125 °C     WTL 3132-GMD-080, -100, -120
168-Pin PGA     Tc = -55 to +125 °C     WTL 3332-GMD-080, -100, -120


XL-Series customers should order the following: 


PACKAGE TYPE    TEMPERATURE RANGE       ORDER NUMBER

144-Pin PGA     Tc = 0 to +85 °C        XL-3132-GCD-080, -100, -120
144-Pin PGA     Tc = -55 to +125 °C     XL-3132-GMD-080, -100, -120
WTL 3132/WTL 3332/XL-3132 
32-BIT FLOATING POINT 
DATA PATH 


PRELIMINARY DATA 

October 1987 

Revision Summary 


CONTENTS 

This table lists major changes since the July 1986 printing of this data sheet. 


Change                                            Description

 1. Features/Description/Architecture             Revised, pages 1-3
 2. Figures 1-3                                   New, pages 1-3
 3. Signal Description                            Revised, page 4
 4. WTL 3132 Block Diagram                        Revised, page 5
 5. WTL 3332 Block Diagram                        Revised, page 6
 6. Register file description (figure 6)          New, page 7
 7. Multiplier/accumulator description            Revised, page 8
 8. Figures 7 and 11                              New, pages 8 and 11
 9. Temporary register description                Revised, page 13
10. Figure 12                                     New, page 13
11. Internal Data Routing description             New, page 16
    (figures 16 and 22)
12. Input/output description                      Revised, page 20
13. Figures 26 and 31                             New, pages 21 and 24
14. Instruction Set description                   Revised, page 26
    (figures 33, 35, and 36)
15. System Interfacing description                Revised, page 30
    (figures 44-51)
16. Initialization description                    Revised, page 35
    (figures 18 and 19)
17. Division description (figure 56)              Revised, page 38
18. IEEE Considerations description               Revised, page 41
19. Figures 18 and 19                             New, page 43
20. AC Specifications and figure 64               Revised, page 44
21. Figure 65                                     Revised, page 45
22. Figure 66                                     Revised, page 46
23. Appendix A (figures 68, 69, 70, and 71)       New, page 49
24. Appendix B                                    Revised, page 55


SUBJECT 



This table lists the component revision to which 
successive versions of this data sheet have referred. 

Data                               Date             Component Suffix

WTL 3132/3332 Preliminary Data     July, 1986       V1A
XL-Series Product Status Report    3/3/87           V1A
XL-Series Product Status Report    7/8/87           V1C
WTL 3132/3332 Errata Sheet         October, 1987    V1C
WTL 3132/3332 Data Sheet           October, 1987    V6D
Physical Dimensions 

[Package outline drawing: bottom view, side view, top view] 


             DIMENSIONS
Symbol   INCHES                MM

A1*      0.091 ± 0.010         2.31 ± 0.25
A2       0.180 typ.            4.57
A3                             1.27
D        1.575 sq. ± 0.016     40.0 ± .41
E1       1.400 sq. ± 0.012     35.56 ± .30
E2       0.050 dia. typ.       1.27
E3       0.018 ± 0.002         .46 ± .05
d        0.070 dia. typ.       1.78
e        0.100 typ.            2.54


*A1 = package plus lid 


Application Note 


COMPATIBILITY WITH WTL 1264 AND WTL 1265 

There are two common WTL 1264/1265 configurations. In 
the first, one WTL 1264 is used for each WTL 1265. In 
the second, two WTL 1264 devices are used for each 
WTL 1265. We will discuss the 1:1 configuration. 
Contact WEITEK for a description of the 2:1 
configuration. 

Several programming models are used in this first 
configuration. As discussed in COMPATIBILITY MODE, the 
most common programming model for the WTL 1264/1265 is 
maximum 64-bit throughput. If the compatibility mode 
bit, Mia, is set to “0”, the WTL 2264/2265 can operate 
in the WTL 1264/1265 programming mode. The effect of 
Mia is different for the WTL 2264 and WTL 2265. 

If Mia is “0” in the WTL 2265, Ms can be used to 
control the WTL 2265’s dummy pipes in the same way it 
controls Pipe 3 in the WTL 1265. In the WTL 2264, Mia 
set to “0” causes PAC to be 01, ACC to be 01 and Mia 
to be “1”. The following tables illustrate the WTL 
2264/2265 mode configuration for several other WTL 
1264/1265 timing models. 


TABLE 18: WTL 1264/WTL 2264 COMPATIBILITY CHART 


              WTL 1264 MODE SETTINGS               COMPATIBLE WTL 2264-50 and -60 MODE SETTINGS

Operation     M 6-4      M 11-8    M 7-6           M 12              M 9-8     M 13      M 7-6
              Pipeline   Pipe      Accumulator             Pipeline  Pipe      Pipe 2    Accumulator
              Config.    Advance                           Config.   Advance   Advance

64-bit max    10         0100      01              0       111*      XX        X         XX
throughput

32-bit max    11         0010      01              1       111       01        0         01
throughput

              00         0000      01              1       100       01        X         01

32-bit        00         0000      01              1       100       00        X         01
min latency



‘Setting M ia to 0 causes Mi = 1, Ma-e = 01, M is = 1, Mz-e = 01. 
WTL 3132 AND WTL 3332 32-BIT 
FLOATING POINT DATA PATH 


Physical Dimensions 


WTL 3132 144-PIN PIN GRID ARRAY 


PRELIMINARY DATA 

July 1986 


[Package outline drawing: bottom view, side view, top view] 


             DIMENSIONS
Symbol   INCHES                MM

A1       0.080 ± 0.008         2.03 ± 0.20
A2       0.180 typ.            4.57 typ.
A3       0.050                 1.27
D        1.575 sq. ± 0.016     40.0 sq. ± 0.41
E1       1.400 sq. ± 0.012     35.56 sq. ± 0.30
E2       0.050 dia. typ.       1.27 dia. typ.
E3       0.018 ± 0.002         .46 ± 0.05
d        0.070 dia. typ.       1.78 dia. typ.
e        0.100 typ.            2.54 typ.


WTL 3332 168-PIN PIN GRID ARRAY 


[Package outline drawing of the 168-pin package] 


             DIMENSIONS
Symbol   INCHES                MM

A1       0.095 ± 0.009         2.41 ± 0.23
A2       0.180 typ.            4.57 typ.
A3       0.050
D        1.750 sq. ± 0.018
E1       1.600 sq.             40.6 sq.
E2       0.050 dia. typ.       1.27 dia. typ.
E3       0.018 ± 0.002         .46 ± 0.05
d        0.070 dia. typ.       1.78 dia. typ.
e        0.100 typ.            2.54 typ.



























